Reward estimation with scheduled knowledge distillation for dialogue policy learning

Junyan Qiu, Haidong Zhang, Yiping Yang. Reward estimation with scheduled knowledge distillation for dialogue policy learning. Connect. Sci., 35(1), December 2023.
