Textual supervision for visually grounded spoken language understanding

Bertrand Higy, Desmond Elliott, Grzegorz Chrupala. Textual supervision for visually grounded spoken language understanding. In Trevor Cohn, Yulan He, Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020, Online Event, 16-20 November 2020, pages 2698-2709. Association for Computational Linguistics, 2020.
