Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR

Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma. Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR. In Hanseok Ko, John H. L. Hansen, editors, Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, Korea, 18-22 September 2022, pages 1016-1020. ISCA, 2022.

Authors

Kun Wei

Yike Zhang

Sining Sun

Lei Xie

Long Ma