Joint Dual Feature Distillation and Gradient Progressive Pruning for BERT compression

Zhou Zhang, Yang Lu, Tengfei Wang, Xing Wei, Zhen Wei. Joint Dual Feature Distillation and Gradient Progressive Pruning for BERT compression. Neural Networks, 179:106533, 2024.