Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR

Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma. Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR. In Hanseok Ko, John H. L. Hansen, editors, Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, Korea, 18-22 September 2022, pages 1016-1020. ISCA, 2022.

Abstract

Abstract is missing.