Biomedical Language Models are Robust to Sub-optimal Tokenization

Bernal Jimenez Gutierrez, Huan Sun, Yu Su. Biomedical Language Models are Robust to Sub-optimal Tokenization. In Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen, editors, The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP@ACL 2023, Toronto, Canada, 13 July 2023. Pages 350-362, Association for Computational Linguistics, 2023.
