Jianing Yang, Sheng Li, Takahiro Shinozaki, Yuki Saito, Hiroshi Saruwatari. Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement. In Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2025, Singapore, October 22-24, 2025. Pages 567-572, IEEE, 2025.