The following publications are possibly variants of this publication:
- Explanation Guided Knowledge Distillation for Pre-trained Language Model Compression. Zhao Yang, Yuanzhe Zhang, Dianbo Sui, Yiming Ju, Jun Zhao 0001, Kang Liu 0001. talip, 23(2), February 2024. [doi]
- ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models. Jianyi Zhang, Aashiq Muhamed, Aditya Anantharaman, Guoyin Wang 0002, Changyou Chen, Kai Zhong, Qingjun Cui, Yi Xu, Belinda Zeng, Trishul Chilimbi, Yiran Chen 0001. acl 2023: 1128-1136. [doi]
- KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation. Marzieh S. Tahaei, Ella Charlaix, Vahid Partovi Nia, Ali Ghodsi 0001, Mehdi Rezagholizadeh. naacl 2022: 2116-2127. [doi]
- PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation. Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie. bionlp 2019: 380-388. [doi]