MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model

Yatai Ji, Junjie Wang, Yuan Gong, Lin Zhang, Yanru Zhu, Hongfa Wang, Jiaxing Zhang, Tetsuya Sakai, Yujiu Yang. MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023. pages 23262-23271, IEEE, 2023.
