OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning

Zhenyu Bi, Meng Lu, Yang Li, Swastik Roy, Weijie Guan, Morteza Ziyadi, Xuan Wang. OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning. In Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh, editors, Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, IJCNLP-AACL 2025, Mumbai, India, December 20-24, 2025. Pages 1713-1728. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics, 2025.

Authors

Zhenyu Bi

Meng Lu

Yang Li

Swastik Roy

Weijie Guan

Morteza Ziyadi

Xuan Wang
