Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models

Muhan Lin, Shuyang Shi, Yue Guo, Behdad Chalaki, Vaishnav Tadiparthi, Ehsan Moradi-Pari, Simon Stepputtis, Joseph Campbell, Katia P. Sycara. Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, USA, November 12-16, 2024, pages 16002-16014. Association for Computational Linguistics, 2024.

Authors

Muhan Lin

Shuyang Shi

Yue Guo

Behdad Chalaki

Vaishnav Tadiparthi

Ehsan Moradi-Pari

Simon Stepputtis

Joseph Campbell

Katia P. Sycara
