Transformer Feed-Forward Layers Are Key-Value Memories

Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. Transformer Feed-Forward Layers Are Key-Value Memories. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Virtual Event / Punta Cana, Dominican Republic, 7-11 November 2021, pages 5484-5495. Association for Computational Linguistics, 2021.
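The paper's titular claim can be sketched in a few lines: a transformer feed-forward layer computes f(x·K^T)·V, which has the same form as a key-value memory, with the rows of the first weight matrix acting as keys and the rows of the second as values. The sketch below is a minimal illustration of that algebraic correspondence, assuming a ReLU activation and omitting bias terms; all names and dimensions are illustrative, not taken from the paper's code.

```python
import numpy as np

def ffn_as_memory(x, K, V):
    """Feed-forward layer viewed as a key-value memory.

    K: (d_mem, d) -- each row is a "key" (first FFN weight matrix).
    V: (d_mem, d) -- each row is a "value" (second FFN weight matrix).
    """
    # Memory coefficients: how strongly the input activates each key.
    coeffs = np.maximum(0.0, x @ K.T)  # ReLU(x . k_i) for each key k_i
    # Output: coefficient-weighted sum of the value vectors.
    return coeffs @ V

rng = np.random.default_rng(0)
d, d_mem = 8, 32                      # model dim, number of memory cells
x = rng.normal(size=d)                # an input representation
K = rng.normal(size=(d_mem, d))       # keys
V = rng.normal(size=(d_mem, d))       # values
y = ffn_as_memory(x, K, V)            # same shape as the input, (d,)
```

Nothing here changes the computation of a standard two-layer FFN; the memory reading is purely a reinterpretation of the matrix products.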

Authors

Mor Geva


Roei Schuster


Jonathan Berant


Omer Levy
