Sunghwan Kim, Dongjin Kang, Taeyoon Kwon, Hyungjoo Chae, Dongha Lee, Jinyoung Yeo. Rethinking Reward Model Evaluation Through the Lens of Reward Overoptimization. In Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar, editors, Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2025, Vienna, Austria, July 27 - August 1, 2025, pages 13252-13280. Association for Computational Linguistics, 2025.