DRIFT: Difference-Aware Reinforcement Through Iterative Fine-Tuning for Language Model


Wenjie Liao, Xiaohui Song, Haonan Lu. DRIFT: Difference-Aware Reinforcement Through Iterative Fine-Tuning for Language Model. In Sven Koenig, Chad Jenkins, Matthew E. Taylor, editors, Fortieth AAAI Conference on Artificial Intelligence, Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence, Sixteenth Symposium on Educational Advances in Artificial Intelligence, AAAI 2026, Singapore, January 20-27, 2026. pages 31988-31996, AAAI Press, 2026.

Abstract is missing.
