Disentangling Length from Quality in Direct Preference Optimization

Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn. Disentangling Length from Quality in Direct Preference Optimization. In Lun-Wei Ku, Andre Martins, Vivek Srikumar, editors, Findings of the Association for Computational Linguistics, ACL 2024, Bangkok, Thailand and virtual meeting, August 11-16, 2024, pages 4998-5017. Association for Computational Linguistics, 2024.