Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training

Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei. Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021. Pages 3203-3215, Association for Computational Linguistics, 2021.

Authors

Bo Zheng

Li Dong

Shaohan Huang

Saksham Singhal

Wanxiang Che

Ting Liu

Xia Song

Furu Wei
