Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis

Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu 0001, Yanyao Bian, Dan Su 0002, Helen Meng. Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. In Hanseok Ko, John H. L. Hansen, editors, Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, Korea, 18-22 September 2022. pages 2573-2577, ISCA, 2022. [doi]

Abstract

Abstract is missing.