Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training

Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei. Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021. Pages 3203-3215, Association for Computational Linguistics, 2021.

@inproceedings{ZhengDHSCLSW21,
  title = {Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training},
  author = {Bo Zheng and Li Dong and Shaohan Huang and Saksham Singhal and Wanxiang Che and Ting Liu and Xia Song and Furu Wei},
  year = {2021},
  url = {https://aclanthology.org/2021.emnlp-main.257},
  researchr = {https://researchr.org/publication/ZhengDHSCLSW21},
  cites = {0},
  citedby = {0},
  pages = {3203--3215},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021},
  editor = {Marie-Francine Moens and Xuanjing Huang and Lucia Specia and Scott Wen-tau Yih},
  publisher = {Association for Computational Linguistics},
}