Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

Pei-hao Su, David Vandyke, Milica Gasic, Nikola Mrksic, Tsung-Hsien Wen, Steve J. Young. Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems. In Proceedings of the SIGDIAL 2015 Conference, The 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2-4 September 2015, Prague, Czech Republic. pages 417-421, The Association for Computer Linguistics, 2015. [doi]

Abstract

Abstract is missing.