Yicheng Chen, Yining Li, Kai Hu, Zerun Ma, Haochen Ye, Kai Chen. MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space. In Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar, editors, Findings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1, 2025, pages 9902-9915. Association for Computational Linguistics, 2025.