How Many Layers and Why? An Analysis of the Model Depth in Transformers

Antoine Simoulin, Benoît Crabbé. How Many Layers and Why? An Analysis of the Model Depth in Transformers. In Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas, editors, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, July 5-10, 2021, pages 221-228. Association for Computational Linguistics, 2021.

Authors

Antoine Simoulin

Benoît Crabbé
