HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis

Sang-Hoon Lee, Seung-bin Kim, Ji-Hyun Lee, Eunwoo Song, Min-Jae Hwang, Seong-Whan Lee. HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022.

Authors

Sang-Hoon Lee

Seung-bin Kim

Ji-Hyun Lee

Eunwoo Song

Min-Jae Hwang

Seong-Whan Lee
