Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning

Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu. Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021, pages 12976-12985. Computer Vision Foundation / IEEE, 2021.

@inproceedings{HuangZH0FF21,
  title = {Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning},
  author = {Zhicheng Huang and Zhaoyang Zeng and Yupan Huang and Bei Liu and Dongmei Fu and Jianlong Fu},
  year = {2021},
  url = {https://openaccess.thecvf.com/content/CVPR2021/html/Huang_Seeing_Out_of_the_Box_End-to-End_Pre-Training_for_Vision-Language_Representation_CVPR_2021_paper.html},
  researchr = {https://researchr.org/publication/HuangZH0FF21},
  pages = {12976-12985},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021},
  publisher = {Computer Vision Foundation / IEEE},
}