Unified Cross-Modal Attention: Robust Audio-Visual Speech Recognition and Beyond

Jiahong Li, Chenda Li, Yifei Wu, Yanmin Qian. Unified Cross-Modal Attention: Robust Audio-Visual Speech Recognition and Beyond. IEEE Transactions on Audio, Speech & Language Processing, 32:1941-1953, 2024.
