Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems

Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi. Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems. In Anna Korhonen, David R. Traum, Lluís Màrquez, editors, Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28 - August 2, 2019, Volume 1: Long Papers, pages 5612-5623. Association for Computational Linguistics, 2019.
