MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria

Wentao Ge, Shunian Chen, Hardy Chen, Nuo Chen 0002, Junying Chen, Zhihong Chen, Wenya Xie, Shuo Yan, ChenghaoZhu ChenghaoZhu, Ziyue Lin, Dingjie Song, Xidong Wang, Anningzhe Gao, Zhiyi Zhang 0007, Jianquan Li, Xiang Wan, Benyou Wang. MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria. In Luis Chiruzzo, Alan Ritter, Lu Wang, editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 1: Long Papers, Albuquerque, New Mexico, USA, April 29 - May 4, 2025. pages 4951-4974, Association for Computational Linguistics, 2025. [doi]

Abstract

Abstract is missing.