Do Diacritics Matter? Evaluating the Impact of Arabic Diacritics on Tokenization and LLM Benchmarks

Go Inoue, Bashar Alhafni, Nizar Habash, Timothy Baldwin. Do Diacritics Matter? Evaluating the Impact of Arabic Diacritics on Tokenization and LLM Benchmarks. In Vera Demberg, Kentaro Inui, Lluís Marquez, editors, Findings of the Association for Computational Linguistics: EACL 2026, Rabat, Morocco, March 24-29, 2026. pages 426-442, Association for Computational Linguistics, 2026.
