Systematic Reward Gap Optimization for Mitigating VLM Hallucinations

Lehan He, Zeren Chen, Zhelun Shi, Tianyu Yu, Jing Shao, Lu Sheng. Systematic Reward Gap Optimization for Mitigating VLM Hallucinations. In Danielle Belgrave, Cheng Zhang, Laura N. Montoya, Hsuan-Tien Lin, Razvan Pascanu, Piotr Koniusz, Marzyeh Ghassemi, Nancy Chen, Iván Vladimir Meza Ruíz, Arturo Loaiza-Bonilla, editors, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, NeurIPS 2025, San Diego, CA, USA, December 2-7, 2025 / Mexico City, Mexico, November 30 - December 5, 2025. 2025.

Authors

Lehan He

Zeren Chen

Zhelun Shi

Tianyu Yu

Jing Shao

Lu Sheng
