How Many Layers and Why? An Analysis of the Model Depth in Transformers

Antoine Simoulin, Benoît Crabbé. How Many Layers and Why? An Analysis of the Model Depth in Transformers. In Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas, editors, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, July 5-10, 2021, pages 221-228. Association for Computational Linguistics, 2021.

@inproceedings{SimoulinC21-1,
  title = {How Many Layers and Why? An Analysis of the Model Depth in Transformers},
  author = {Antoine Simoulin and Benoît Crabbé},
  year = {2021},
  url = {https://aclanthology.org/2021.acl-srw.23},
  researchr = {https://researchr.org/publication/SimoulinC21-1},
  cites = {0},
  citedby = {0},
  pages = {221-228},
  booktitle = {Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, July 5-10, 2021},
  editor = {Jad Kabbara and Haitao Lin and Amandalynne Paullada and Jannis Vamvas},
  publisher = {Association for Computational Linguistics},
  isbn = {978-1-952148-03-3},
}