Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie. U-Style: Cascading U-Nets With Multi-Level Speaker and Style Modeling for Zero-Shot Voice Cloning. IEEE Transactions on Audio, Speech & Language Processing, 32:4026-4035, 2024.