A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

He Bai, Renjie Zheng, Junkun Chen, Mingbo Ma, Xintong Li, Liang Huang. A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 1399-1411, PMLR, 2022.
