Reward estimation with scheduled knowledge distillation for dialogue policy learning

Junyan Qiu, Haidong Zhang, Yiping Yang. Reward estimation with scheduled knowledge distillation for dialogue policy learning. Connect. Sci., 35(1), December 2023.
