Chiori Hori, Huda Alamri, Jue Wang, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh. End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019. Pages 2352-2356, IEEE, 2019.