Multimodal Grounding for Sequence-to-sequence Speech Recognition

Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze. Multimodal Grounding for Sequence-to-sequence Speech Recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019, pages 8648-8652. IEEE, 2019.
