Human Alignment of Large Language Models through Online Preference Optimisation

Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot. Human Alignment of Large Language Models through Online Preference Optimisation. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.

