Bridging Facial Imagery and Vocal Reality: Stable Diffusion-Enhanced Voice Generation

Yueqian Lin, Dong Liu, Yunfei Xu, Hongbin Suo, Ming Li. Bridging Facial Imagery and Vocal Reality: Stable Diffusion-Enhanced Voice Generation. In Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao, editors, 14th IEEE International Symposium on Chinese Spoken Language Processing, ISCSLP 2024, Beijing, China, November 7-10, 2024, pages 229-233. IEEE, 2024.
