A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models

Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. In Mingxuan Wang, Imed Zitouni, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023 - Industry Track, Singapore, December 6-10, 2023. Pages 20-31, Association for Computational Linguistics, 2023.

Authors

Takuma Udagawa

Aashka Trivedi

Michele Merler

Bishwaranjan Bhattacharjee
