A Policy Iteration Algorithm for Learning from Preference-Based Feedback

Christian Wirth, Johannes Fürnkranz. A Policy Iteration Algorithm for Learning from Preference-Based Feedback. In Allan Tucker, Frank Höppner, Arno Siebes, Stephen Swift, editors, Advances in Intelligent Data Analysis XII - 12th International Symposium, IDA 2013, London, UK, October 17-19, 2013. Proceedings. Volume 8207 of Lecture Notes in Computer Science, pages 427-437, Springer, 2013.

@inproceedings{WirthF13,
  title = {A Policy Iteration Algorithm for Learning from Preference-Based Feedback},
  author = {Christian Wirth and Johannes Fürnkranz},
  year = {2013},
  doi = {10.1007/978-3-642-41398-8_37},
  url = {http://dx.doi.org/10.1007/978-3-642-41398-8_37},
  researchr = {https://researchr.org/publication/WirthF13},
  cites = {0},
  citedby = {0},
  pages = {427--437},
  booktitle = {Advances in Intelligent Data Analysis XII - 12th International Symposium, IDA 2013, London, UK, October 17-19, 2013. Proceedings},
  editor = {Allan Tucker and Frank Höppner and Arno Siebes and Stephen Swift},
  volume = {8207},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-3-642-41397-1},
}