CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models

Fuwen Luo, Chi Chen, Zihao Wan, Zhaolu Kang, Qidong Yan, Yingjie Li, Xiaolong Wang, Siyu Wang, Ziyue Wang, Xiaoyue Mi, Peng Li, Ning Ma, Maosong Sun, Yang Liu. CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models. In Lun-Wei Ku, Andre Martins, Vivek Srikumar, editors, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, pages 10639-10659. Association for Computational Linguistics, 2024.
