Streamable Speech Representation Disentanglement and Multi-Level Prosody Modeling for Live One-Shot Voice Conversion

Haoquan Yang, Liqun Deng, Yu Ting Yeung, Nianzu Zheng, Yong Xu. Streamable Speech Representation Disentanglement and Multi-Level Prosody Modeling for Live One-Shot Voice Conversion. In Hanseok Ko, John H. L. Hansen, editors, Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, Korea, 18-22 September 2022, pages 2578-2582. ISCA, 2022.
