Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao. Less is More: Task-aware Layer-wise Distillation for Language Model Compression. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA. Volume 202 of Proceedings of Machine Learning Research, pages 20852-20867, PMLR, 2023.

@inproceedings{LiangZZHCZ23,
  title = {Less is More: Task-aware Layer-wise Distillation for Language Model Compression},
  author = {Chen Liang and Simiao Zuo and Qingru Zhang and Pengcheng He and Weizhu Chen and Tuo Zhao},
  year = {2023},
  url = {https://proceedings.mlr.press/v202/liang23j.html},
  researchr = {https://researchr.org/publication/LiangZZHCZ23},
  pages = {20852--20867},
  booktitle = {International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA},
  editor = {Andreas Krause and Emma Brunskill and Kyunghyun Cho and Barbara Engelhardt and Sivan Sabato and Jonathan Scarlett},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}