Batch Policy Gradient Methods for Improving Neural Conversation Models

Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter. Batch Policy Gradient Methods for Improving Neural Conversation Models. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.

Authors

Kirthevasan Kandasamy

Yoram Bachrach

Ryota Tomioka

Daniel Tarlow

David Carter
