Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

Yuhang Zhou, Jing Zhu, Paiheng Xu, Xiaoyu Liu, Xiyao Wang, Danai Koutra, Wei Ai, Furong Huang. Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, USA, November 12-16, 2024. pages 3315-3333, Association for Computational Linguistics, 2024.
