Deep Audio-visual System for Closed-set Word-level Speech Recognition

Yougen Yuan, Wei Tang, Minhao Fan, Yue Cao, Peng Zhang 0005, Lei Xie 0001. Deep Audio-visual System for Closed-set Word-level Speech Recognition. In Wen Gao 0001, Helen Mei-Ling Meng, Matthew Turk, Susan R. Fussell, Björn W. Schuller, Yale Song, Kai Yu 0004, editors, International Conference on Multimodal Interaction, ICMI 2019, Suzhou, China, October 14-18, 2019, pages 540-545. ACM, 2019.

Authors

Yougen Yuan


Wei Tang


Minhao Fan


Yue Cao


Peng Zhang 0005


Lei Xie 0001
