VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching

Junyu Bi, Daixuan Cheng, Ping Yao, Bochen Pang, Yuefeng Zhan, Chuanguang Yang, Yujing Wang, Hao Sun, Weiwei Deng, Qi Zhang. VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching. In IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, October 1-6, 2023. pages 2584-2593, IEEE, 2023. [doi]

Abstract

Abstract is missing.