End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features

Chiori Hori, Huda Alamri, Jue Wang 0010, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh. End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019. pages 2352-2356, IEEE, 2019. [doi]

Abstract

Abstract is missing.