How Many Layers and Why? An Analysis of the Model Depth in Transformers

Antoine Simoulin, Benoît Crabbé. How Many Layers and Why? An Analysis of the Model Depth in Transformers. In Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas, editors, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, July 5-10, 2021, pages 221-228. Association for Computational Linguistics, 2021.

Abstract

Abstract is missing.