ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition

Thai Binh Nguyen, Thi-Van Nguyen, Quoc Truong Do, Chi Mai Luong. ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition. In Odette Scharenborg, Catharine Oertel, Khiet Truong, editors, 26th Annual Conference of the International Speech Communication Association, Interspeech 2025, Rotterdam, The Netherlands, 17-21 August 2025. ISCA, 2025.

Abstract

Abstract is missing.