Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders

Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai. Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. In Gernot Kubin, Zdravko Kacic, editors, Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019. pages 1308-1312, ISCA, 2019. [doi]

Abstract

Abstract is missing.