SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody

Hui Lu, Xixin Wu, Zhiyong Wu 0003, Helen Meng. SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody. In Abdulmotaleb El-Saddik, Tao Mei, Rita Cucchiara, Marco Bertini 0001, Diana Patricia Tobon Vallejo, Pradeep K. Atrey, M. Shamim Hossain, editors, Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, Ottawa, ON, Canada, 29 October 2023 - 3 November 2023, pages 2829-2837. ACM, 2023.

Authors

Hui Lu

Xixin Wu

Zhiyong Wu 0003

Helen Meng