Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence

Peter A. Heeman, Jordan Fryer, Rebecca Lunsford, Andrew Rueckert, Ethan Selfridge. Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. In INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, September 9-13, 2012. Pages 747-750, ISCA, 2012.