DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal. DORB: Dynamically Optimizing Multiple Rewards with Bandits. In Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020. Pages 7766-7780, Association for Computational Linguistics, 2020.

