VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion

Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng. VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Virtual and Singapore, 23-27 May 2022, pages 7252-7256. IEEE, 2022.
