Confronting Reward Model Overoptimization with Constrained RLHF


Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Marcus McAleer. Confronting Reward Model Overoptimization with Constrained RLHF. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.

