Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis


Xueyuan Chen, Shun Lei, Zhiyong Wu, Dong Xu, Weifeng Zhao, Helen Meng. Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis. In Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, YoungGyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na, editors, Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022. pages 7193-7202, International Committee on Computational Linguistics, 2022.

@inproceedings{ChenL0XZM22,
  title = {Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis},
  author = {Xueyuan Chen and Shun Lei and Zhiyong Wu and Dong Xu and Weifeng Zhao and Helen Meng},
  year = {2022},
  url = {https://aclanthology.org/2022.coling-1.630},
  researchr = {https://researchr.org/publication/ChenL0XZM22},
  cites = {0},
  citedby = {0},
  pages = {7193--7202},
  booktitle = {Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022},
  editor = {Nicoletta Calzolari and Chu-Ren Huang and Hansaem Kim and James Pustejovsky and Leo Wanner and Key-Sun Choi and Pum-Mo Ryu and Hsin-Hsi Chen and Lucia Donatelli and Heng Ji and Sadao Kurohashi and Patrizia Paggio and Nianwen Xue and Seokhwan Kim and YoungGyun Hahm and Zhong He and Tony Kyungil Lee and Enrico Santus and Francis Bond and Seung-Hoon Na},
  publisher = {International Committee on Computational Linguistics},
}
