Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou. Token Dropping for Efficient BERT Pretraining. In Smaranda Muresan, Preslav Nakov, Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 3774-3784. Association for Computational Linguistics, 2022.