Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description

Chiori Hori, Takaaki Hori, Gordon Wichern, Jue Wang, Teng-Yok Lee, Anoop Cherian, Tim K. Marks. Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description. In 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2018, Salt Lake City, UT, USA, June 18-22, 2018, pages 2528-2531. IEEE Computer Society, 2018.
