Information Aggregation for Multi-Head Attention with Routing-by-Agreement

Jian Li, Baosong Yang, Zi-Yi Dou, Xing Wang, Michael R. Lyu, Zhaopeng Tu. Information Aggregation for Multi-Head Attention with Routing-by-Agreement. In Jill Burstein, Christy Doran, Thamar Solorio, editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 3566-3575. Association for Computational Linguistics, 2019.