Zhaoci Liu, Liping Chen, Ya-Jun Hu, Zhen-Hua Ling, Jia Pan. PE-Wav2vec: A Prosody-Enhanced Speech Model for Self-Supervised Prosody Learning in TTS. IEEE Transactions on Audio, Speech & Language Processing, 32:4199-4210, 2024.