Efficient Reinforcement Learning from Human Feedback via Bayesian preference inference

Matteo Cercola, Valeria Capretti, Simone Formentin. Efficient Reinforcement Learning from Human Feedback via Bayesian preference inference. IFAC J. Syst. Control., 35:100398, 2026.
