Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu. Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model. In Kevin Duh, Helena Gómez-Adorno, Steven Bethard, editors, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), NAACL 2024, Mexico City, Mexico, June 16-21, 2024, pages 8164-8180. Association for Computational Linguistics, 2024.