Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin. Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition. In Smaranda Muresan, Preslav Nakov, Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 4491-4503. Association for Computational Linguistics, 2022.

Authors

Xichen Pan

Peiyu Chen

Yichen Gong

Helong Zhou

Xinbing Wang

Zhouhan Lin
