Lessons on Parameter Sharing across Layers in Transformers

Sho Takase, Shun Kiyono. Lessons on Parameter Sharing across Layers in Transformers. In Nafise Sadat Moosavi, Iryna Gurevych, Yufang Hou, Gyuwan Kim, Young-Jin Kim, Tal Schuster, Ameeta Agrawal, editors, Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2023, Toronto, Canada (Hybrid), July 13, 2023. Pages 78-90, Association for Computational Linguistics, 2023.

Authors

Sho Takase

Shun Kiyono