Capturing Cross-Modal Semantics by Generating Comments for Image-Text Contents

Shun Qian, Bingquan Liu, Chengjie Sun, Zhen Xu 0003, Baoxun Wang. Capturing Cross-Modal Semantics by Generating Comments for Image-Text Contents. In Josef Kittler, Hongkai Xiong, Jian Yang 0003, Xilin Chen 0001, Jiwen Lu, Weiyao Lin, Jingyi Yu 0002, Weishi Zheng 0001, editors, Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Shanghai, China, October 15-18, 2025, Proceedings, Part VI. Volume 16277 of Lecture Notes in Computer Science, pages 135-148, Springer, 2025.
