Analysis of Vocabulary and Subword Tokenization Settings for Optimal Fine-tuning of MT: A Case Study of In-domain Translation

Javad PourMostafa Roshan Sharami, Dimitar Shterionov, Pieter Spronck. Analysis of Vocabulary and Subword Tokenization Settings for Optimal Fine-tuning of MT: A Case Study of In-domain Translation. In Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov, editors, Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, RANLP 2025, Varna, Bulgaria, September 8-10, 2025, pages 970-979. INCOMA Ltd., Shoumen, Bulgaria, 2025.

Authors

Javad PourMostafa Roshan Sharami

Dimitar Shterionov

Pieter Spronck
