Lifting the Curse of Capacity Gap in Distilling Language Models

Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song. Lifting the Curse of Capacity Gap in Distilling Language Models. In Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 4535-4553. Association for Computational Linguistics, 2023.

Authors

Chen Zhang

Yang Yang

Jiahao Liu

Jingang Wang

Yunsen Xian

Benyou Wang

Dawei Song
