FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions

Noam Rotstein, David Bensaïd, Shaked Brody, Roy Ganz, Ron Kimmel. FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions. In IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2024, Waikoloa, HI, USA, January 3-8, 2024, pages 5677-5688. IEEE, 2024.
