Analyzing the Impact of Tokenization on Multilingual Epidemic Surveillance in Low-Resource Languages

Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Gaël Lejeune, Adam Jatowt, Moses Odeo. Analyzing the Impact of Tokenization on Multilingual Epidemic Surveillance in Low-Resource Languages. In Gernot A. Fink, Rajiv Jain, Koichi Kise, Richard Zanibbi, editors, Document Analysis and Recognition - ICDAR 2023 - 17th International Conference, San José, CA, USA, August 21-26, 2023, Proceedings, Part III. Volume 14189 of Lecture Notes in Computer Science, pages 17-32, Springer, 2023.

Abstract

Abstract is missing.