Biomedical Language Models are Robust to Sub-optimal Tokenization

Bernal Jimenez Gutierrez, Huan Sun, Yu Su. Biomedical Language Models are Robust to Sub-optimal Tokenization. In Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen, editors, The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP@ACL 2023, Toronto, Canada, 13 July 2023, pages 350-362. Association for Computational Linguistics, 2023.

@inproceedings{Gutierrez0023,
  title = {Biomedical Language Models are Robust to Sub-optimal Tokenization},
  author = {Bernal Jimenez Gutierrez and Huan Sun and Yu Su},
  year = {2023},
  url = {https://aclanthology.org/2023.bionlp-1.32},
  researchr = {https://researchr.org/publication/Gutierrez0023},
  pages = {350--362},
  booktitle = {The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP@ACL 2023, Toronto, Canada, 13 July 2023},
  editor = {Dina Demner-Fushman and Sophia Ananiadou and Kevin Cohen},
  publisher = {Association for Computational Linguistics},
}