Long-Khanh Pham, Thanh V. T. Tran, Minh-Tan Pham, Van Nguyen. RESOUND: Speech Reconstruction from Silent Videos via Acoustic-Semantic Decomposed Modeling. In Odette Scharenborg, Catharine Oertel, Khiet Truong, editors, 26th Annual Conference of the International Speech Communication Association, Interspeech 2025, Rotterdam, The Netherlands, 17-21 August 2025. ISCA, 2025.