Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments

Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Jinyu Li, Yashesh Gaur. Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments. In IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023, Taipei, Taiwan, December 16-20, 2023. Pages 1-8, IEEE, 2023.
