BERT is Not an Interlingua and the Bias of Tokenization

Jasdeep Singh, Bryan McCann, Richard Socher, Caiming Xiong. BERT is Not an Interlingua and the Bias of Tokenization. In Colin Cherry, Greg Durrett, George F. Foster, Reza Haffari, Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta, editors, Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP, DeepLo@EMNLP-IJCNLP 2019, Hong Kong, China, November 3, 2019. Pages 47-55, Association for Computational Linguistics, 2019.
