ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data

Maya Varma, Jean-Benoit Delbrouck, Sarah M. Hooper, Akshay Chaudhari, Curtis P. Langlotz. ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data. In IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, October 1-6, 2023. pages 22168-22178, IEEE, 2023.
