Dialog policy optimization for low resource setting using Self-play and Reward based Sampling

Tharindu Madusanka, Durashi Langappuli, Thisara Welmilla, Uthayasanker Thayasivam, Sanath Jayasena. Dialog policy optimization for low resource setting using Self-play and Reward based Sampling. In Minh Le Nguyen, Mai Chi Luong, Sanghoun Song, editors, Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation, PACLIC 2020, Hanoi, Vietnam, October 24-26, 2020. Pages 178-187, Association for Computational Linguistics, 2020.
