Lifting the Curse of Capacity Gap in Distilling Language Models

Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song. Lifting the Curse of Capacity Gap in Distilling Language Models. In Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 4535-4553. Association for Computational Linguistics, 2023.
