XtremeDistil: Multi-stage Distillation for Massive Multilingual Models

Subhabrata Mukherjee, Ahmed Hassan Awadallah. XtremeDistil: Multi-stage Distillation for Massive Multilingual Models. In Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel R. Tetreault, editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, pages 2221-2234. Association for Computational Linguistics, 2020.
