How to Evaluate Reward Models for RLHF

Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica. How to Evaluate Reward Models for RLHF. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025.

Authors

Evan Frick

Tianle Li

Connor Chen

Wei-Lin Chiang

Anastasios Nikolas Angelopoulos

Jiantao Jiao

Banghua Zhu

Joseph E. Gonzalez

Ion Stoica
