Lessons on Parameter Sharing across Layers in Transformers

Sho Takase, Shun Kiyono. Lessons on Parameter Sharing across Layers in Transformers. In Nafise Sadat Moosavi, Iryna Gurevych, Yufang Hou, Gyuwan Kim, Young-Jin Kim, Tal Schuster, Ameeta Agrawal, editors, Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023, Toronto, Canada (Hybrid), July 13, 2023. Pages 78-90, Association for Computational Linguistics, 2023.
