Learning Dialogue Policy Efficiently Through Dyna Proximal Policy Optimization

Chenping Huang, Bin Cao. Learning Dialogue Policy Efficiently Through Dyna Proximal Policy Optimization. In Honghao Gao, Xinheng Wang, Wei Wei, Tasos Dagiuklas, editors, Collaborative Computing: Networking, Applications and Worksharing - 18th EAI International Conference, CollaborateCom 2022, Hangzhou, China, October 15-16, 2022, Proceedings, Part I. Volume 460 of Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, pages 396-414, Springer, 2022.

Abstract

Abstract is missing.