Patient Knowledge Distillation for BERT Model Compression

Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu. Patient Knowledge Distillation for BERT Model Compression. In Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan, editors, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, pages 4322-4331. Association for Computational Linguistics, 2019.

Authors

Siqi Sun

Yu Cheng

Zhe Gan

Jingjing Liu
