Le Zhang, Rabiul Awal, Aishwarya Agrawal. Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024, pages 13774-13784. IEEE, 2024.