VideoCon: Robust Video-Language Alignment via Contrast Captions

Hritik Bansal, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang, Aditya Grover. VideoCon: Robust Video-Language Alignment via Contrast Captions. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024. pages 13927-13937, IEEE, 2024.
