Confronting Reward Model Overoptimization with Constrained RLHF

Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Marcus McAleer. Confronting Reward Model Overoptimization with Constrained RLHF. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.