VCGD: Visual Clue Guided Decoding with Caption Model for Mitigating Hallucination in Multimodal Large Language Models

Guoqing Chen, Fu Zhang 0001, Bingqian Liu, Chenglong Lu, Jingwei Cheng. VCGD: Visual Clue Guided Decoding with Caption Model for Mitigating Hallucination in Multimodal Large Language Models. In Sven Koenig, Chad Jenkins, Matthew E. Taylor, editors, Fortieth AAAI Conference on Artificial Intelligence, Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence, Sixteenth Symposium on Educational Advances in Artificial Intelligence, AAAI 2026, Singapore, January 20-27, 2026, pages 20041-20049. AAAI Press, 2026.

Authors

Guoqing Chen

Fu Zhang 0001

Bingqian Liu

Chenglong Lu

Jingwei Cheng
