CLIP-based image captioning via unsupervised cycle-consistency in the latent space

Romain Bielawski, Rufin VanRullen. CLIP-based image captioning via unsupervised cycle-consistency in the latent space. In Burcu Can, Maximilian Mozes, Samuel Cahyawijaya, Naomi Saphra, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Chen Zhao, Isabelle Augenstein, Anna Rogers, KyungHyun Cho, Edward Grefenstette, Lena Voita, editors, Proceedings of the 8th Workshop on Representation Learning for NLP, RepL4NLP@ACL 2023, Toronto, Canada, July 13, 2023. Pages 266-275, Association for Computational Linguistics, 2023.
