Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence

Peter A. Heeman, Jordan Fryer, Rebecca Lunsford, Andrew Rueckert, Ethan Selfridge. Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. In INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012. pages 747-750, ISCA, 2012.

@inproceedings{HeemanFLRS12,
  title = {Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence},
  author = {Peter A. Heeman and Jordan Fryer and Rebecca Lunsford and Andrew Rueckert and Ethan Selfridge},
  year = {2012},
  url = {http://interspeech2012.org/accepted-abstract.html?id=1423},
  researchr = {https://researchr.org/publication/HeemanFLRS12},
  cites = {0},
  citedby = {0},
  pages = {747--750},
  booktitle = {INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012},
  publisher = {ISCA},
}