Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws

Kush Bhatia, Wenshuo Guo, Jacob Steinhardt. Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws. In Francisco J. R. Ruiz, Jennifer G. Dy, Jan-Willem van de Meent, editors, International Conference on Artificial Intelligence and Statistics, 25-27 April 2023, Palau de Congressos, Valencia, Spain. Volume 206 of Proceedings of Machine Learning Research, pages 11149-11171, PMLR, 2023.
