Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback

Khanh Nguyen, Hal Daumé III, Jordan L. Boyd-Graber. Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback. In Martha Palmer, Rebecca Hwa, Sebastian Riedel, editors, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017, pages 1465-1475. Association for Computational Linguistics, 2017.

Authors

Khanh Nguyen

Hal Daumé III

Jordan L. Boyd-Graber
