Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut. Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021, pages 3558-3568. Computer Vision Foundation / IEEE, 2021.

@inproceedings{ChangpinyoSDS21,
  title = {Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts},
  author = {Soravit Changpinyo and Piyush Sharma and Nan Ding and Radu Soricut},
  year = {2021},
  url = {https://openaccess.thecvf.com/content/CVPR2021/html/Changpinyo_Conceptual_12M_Pushing_Web-Scale_Image-Text_Pre-Training_To_Recognize_Long-Tail_Visual_CVPR_2021_paper.html},
  researchr = {https://researchr.org/publication/ChangpinyoSDS21},
  pages = {3558--3568},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021},
  publisher = {Computer Vision Foundation / IEEE},
}