Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning

Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu. Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021, pages 12976-12985. Computer Vision Foundation / IEEE, 2021.