Geometrically-Aware Dual Transformer Encoding Visual and Textual Features for Image Captioning

Yu-Ling Chang, Hao-Shang Ma, Shiou-Chi Li, Jen-Wei Huang. Geometrically-Aware Dual Transformer Encoding Visual and Textual Features for Image Captioning. In De-Nian Yang, Xing Xie, Vincent S. Tseng, Jian Pei, Jen-Wei Huang, Jerry Chun-Wei Lin, editors, Advances in Knowledge Discovery and Data Mining - 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024, Taipei, Taiwan, May 7-10, 2024, Proceedings, Part V. Volume 14649 of Lecture Notes in Computer Science, pages 15-27, Springer, 2024.

Abstract

Abstract is missing.