Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations

Cheng Gong, Longbiao Wang, Zhenhua Ling, Shaotong Guo, Ju Zhang, Jianwu Dang. Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. Pages 5724-5728, IEEE, 2021.
