A Policy Iteration Algorithm for Learning from Preference-Based Feedback

Christian Wirth, Johannes Fürnkranz. A Policy Iteration Algorithm for Learning from Preference-Based Feedback. In Allan Tucker, Frank Höppner, Arno Siebes, Stephen Swift, editors, Advances in Intelligent Data Analysis XII - 12th International Symposium, IDA 2013, London, UK, October 17-19, 2013. Proceedings. Volume 8207 of Lecture Notes in Computer Science, pages 427-437, Springer, 2013.
