DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation

Yongxin Zhu, Zhujin Gao, Xinyuan Zhou, Zhongyi Ye, Linli Xu. DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, pages 11573-11583. Association for Computational Linguistics, 2023.

Abstract

Abstract is missing.