MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis

Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng. MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31:3290-3303, 2023. doi:10.1109/TASLP.2023.3301217

@article{LeiZCWWKM23,
  title = {{MSStyleTTS}: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis},
  author = {Shun Lei and Yixuan Zhou and Liyang Chen and Zhiyong Wu and Xixin Wu and Shiyin Kang and Helen Meng},
  year = {2023},
  doi = {10.1109/TASLP.2023.3301217},
  url = {https://doi.org/10.1109/TASLP.2023.3301217},
  researchr = {https://researchr.org/publication/LeiZCWWKM23},
  cites = {0},
  citedby = {0},
  journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  volume = {31},
  pages = {3290--3303},
}