End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features

Chiori Hori, Huda Alamri, Jue Wang 0010, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh. End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019, pages 2352-2356. IEEE, 2019.

Authors

Chiori Hori

Huda Alamri

Jue Wang 0010

Gordon Wichern

Takaaki Hori

Anoop Cherian

Tim K. Marks

Vincent Cartillier

Raphael Gontijo Lopes

Abhishek Das

Irfan Essa

Dhruv Batra

Devi Parikh
