Multimodal Grounding for Sequence-to-sequence Speech Recognition

Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze. Multimodal Grounding for Sequence-to-sequence Speech Recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019, pages 8648-8652. IEEE, 2019. doi:10.1109/ICASSP.2019.8682750

@inproceedings{CaglayanSPBM19,
  title = {Multimodal Grounding for Sequence-to-sequence Speech Recognition},
  author = {Ozan Caglayan and Ramon Sanabria and Shruti Palaskar and Loïc Barrault and Florian Metze},
  year = {2019},
  doi = {10.1109/ICASSP.2019.8682750},
  url = {https://doi.org/10.1109/ICASSP.2019.8682750},
  researchr = {https://researchr.org/publication/CaglayanSPBM19},
  cites = {0},
  citedby = {0},
  pages = {8648--8652},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019},
  publisher = {IEEE},
  isbn = {978-1-4799-8131-1},
}