Natural actor and belief critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs

Filip Jurcícek, Blaise Thomson, Steve Young. Natural actor and belief critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs. TSLP, 7(3):6, 2011. [doi]

The following publications are possibly variants of this publication:

Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems. Filip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve Young. Interspeech 2010: 90-93 [doi]

Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management. Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve J. Young. SIGDIAL 2017: 147-157 [doi]
