Deep Audio-visual System for Closed-set Word-level Speech Recognition

Yougen Yuan, Wei Tang, Minhao Fan, Yue Cao, Peng Zhang, Lei Xie. Deep Audio-visual System for Closed-set Word-level Speech Recognition. In Wen Gao, Helen Mei-Ling Meng, Matthew Turk, Susan R. Fussell, Björn W. Schuller, Yale Song, Kai Yu, editors, International Conference on Multimodal Interaction, ICMI 2019, Suzhou, China, October 14-18, 2019, pages 540-545. ACM, 2019. doi:10.1145/3340555.3356102

@inproceedings{YuanTFC0019,
  title = {Deep Audio-visual System for Closed-set Word-level Speech Recognition},
  author = {Yougen Yuan and Wei Tang and Minhao Fan and Yue Cao and Peng Zhang and Lei Xie},
  year = {2019},
  doi = {10.1145/3340555.3356102},
  url = {https://doi.org/10.1145/3340555.3356102},
  researchr = {https://researchr.org/publication/YuanTFC0019},
  pages = {540--545},
  booktitle = {International Conference on Multimodal Interaction, ICMI 2019, Suzhou, China, October 14-18, 2019},
  editor = {Wen Gao and Helen Mei-Ling Meng and Matthew Turk and Susan R. Fussell and Björn W. Schuller and Yale Song and Kai Yu},
  publisher = {ACM},
  isbn = {978-1-4503-6860-5},
}