VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning

Yifan Peng, Krishna C. Puvvada, Zhehuai Chen, Piotr Zelasko, He Huang 0012, Kunal Dhawan, Ke Hu, Shinji Watanabe 0001, Jagadeesh Balam, Boris Ginsburg. VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning. In Luis Chiruzzo, Alan Ritter, Lu Wang, editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 1: Long Papers, Albuquerque, New Mexico, USA, April 29 - May 4, 2025. Pages 5787-5802. Association for Computational Linguistics, 2025.
