SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training

Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei. SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training. In Yoav Goldberg, Zornitsa Kozareva, Yue Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, pages 1663-1676. Association for Computational Linguistics, 2022.

@inproceedings{ZhangZA0D0W22,
  title = {SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training},
  author = {Ziqiang Zhang and Long Zhou and Junyi Ao and Shujie Liu and Lirong Dai and Jinyu Li and Furu Wei},
  year = {2022},
  url = {https://aclanthology.org/2022.emnlp-main.108},
  researchr = {https://researchr.org/publication/ZhangZA0D0W22},
  cites = {0},
  citedby = {0},
  pages = {1663--1676},
  booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11},
  editor = {Yoav Goldberg and Zornitsa Kozareva and Yue Zhang},
  publisher = {Association for Computational Linguistics},
}