Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao. Less is More: Task-aware Layer-wise Distillation for Language Model Compression. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA. Volume 202 of Proceedings of Machine Learning Research, pages 20852-20867, PMLR, 2023.

@inproceedings{LiangZZHCZ23,
  title = {Less is More: Task-aware Layer-wise Distillation for Language Model Compression},
  author = {Chen Liang and Simiao Zuo and Qingru Zhang and Pengcheng He and Weizhu Chen and Tuo Zhao},
  year = {2023},
  url = {https://proceedings.mlr.press/v202/liang23j.html},
  researchr = {https://researchr.org/publication/LiangZZHCZ23},
  pages = {20852--20867},
  booktitle = {International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA},
  editor = {Andreas Krause and Emma Brunskill and Kyunghyun Cho and Barbara Engelhardt and Sivan Sabato and Jonathan Scarlett},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}