Bootstrapping Language Models with DPO Implicit Rewards

Changyu Chen, Zichen Liu, Chao Du, Tianyu Pang, Qian Liu, Arunesh Sinha, Pradeep Varakantham, Min Lin. Bootstrapping Language Models with DPO Implicit Rewards. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025.
