Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks

Julia Kreutzer, Stefan Riezler, Carolin Lawrence. Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks. In Zornitsa Kozareva, Sujith Ravi, Andreas Vlachos, Priyanka Agrawal, André F. T. Martins, editors, Proceedings of the 5th Workshop on Structured Prediction for NLP, SPNLP@ACL-IJCNLP 2021, Online, August 6, 2021, pages 37-43. Association for Computational Linguistics, 2021.
