End-to-End Text-to-Speech Using Latent Duration Based on VQ-VAE

Yusuke Yasuda, Xin Wang, Junichi Yamagishi. End-to-End Text-to-Speech Using Latent Duration Based on VQ-VAE. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. pages 5694-5698, IEEE, 2021.
