Lipschitz Constrained Parameter Initialization for Deep Transformers

Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang. Lipschitz Constrained Parameter Initialization for Deep Transformers. In Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel R. Tetreault, editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. pages 397-402, Association for Computational Linguistics, 2020. [doi]

Abstract

Abstract is missing.