Contrastive Region Guidance: Improving Grounding in Vision-Language Models Without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal. Contrastive Region Guidance: Improving Grounding in Vision-Language Models Without Training. In Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol, editors, Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part LXXIX. Volume 15137 of Lecture Notes in Computer Science, pages 198-215, Springer, 2024.
