Extremely Small BERT Models from Mixed-Vocabulary Training

Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou. Extremely Small BERT Models from Mixed-Vocabulary Training. In Paola Merlo, Jörg Tiedemann, Reut Tsarfaty, editors, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021, Online, April 19 - 23, 2021. pages 2753-2759, Association for Computational Linguistics, 2021.