Teaching Structured Vision & Language Concepts to Vision & Language Models

Sivan Doveh, Assaf Arbelle, Sivan Harary, Eli Schwartz, Roei Herzig, Raja Giryes, Rogério Feris, Rameswar Panda, Shimon Ullman, Leonid Karlinsky. Teaching Structured Vision & Language Concepts to Vision & Language Models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023, pages 2657-2668. IEEE, 2023.
