Attention-Based Multimodal Fusion for Video Description

Chiori Hori, Takaaki Hori, Teng-Yok Lee, Ziming Zhang, Bret Harsham, John R. Hershey, Tim K. Marks, Kazuhiko Sumi. Attention-Based Multimodal Fusion for Video Description. In IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pages 4203-4212. IEEE, 2017.
