A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models

Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. In Mingxuan Wang, Imed Zitouni, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023, pages 20-31. Association for Computational Linguistics, 2023.

@inproceedings{UdagawaTMB23,
  title = {A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models},
  author = {Takuma Udagawa and Aashka Trivedi and Michele Merler and Bishwaranjan Bhattacharjee},
  year = {2023},
  url = {https://aclanthology.org/2023.emnlp-industry.3},
  researchr = {https://researchr.org/publication/UdagawaTMB23},
  pages = {20--31},
  booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023},
  editor = {Mingxuan Wang and Imed Zitouni},
  publisher = {Association for Computational Linguistics},
}