Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders

Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai. Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. In Gernot Kubin, Zdravko Kacic, editors, Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019. pages 1308-1312, ISCA, 2019.

@inproceedings{OkamotoTSK19-1,
  title = {Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders},
  author = {Takuma Okamoto and Tomoki Toda and Yoshinori Shiga and Hisashi Kawai},
  year = {2019},
  doi = {10.21437/Interspeech.2019-1288},
  url = {https://doi.org/10.21437/Interspeech.2019-1288},
  researchr = {https://researchr.org/publication/OkamotoTSK19-1},
  cites = {0},
  citedby = {0},
  pages = {1308--1312},
  booktitle = {Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019},
  editor = {Gernot Kubin and Zdravko Kacic},
  publisher = {ISCA},
}