Learning Joint Representations of Videos and Sentences with Web Image Search

Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya. Learning Joint Representations of Videos and Sentences with Web Image Search. In Gang Hua, Hervé Jégou, editors, Computer Vision - ECCV 2016 Workshops - Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part I. Volume 9913 of Lecture Notes in Computer Science, pages 651-667, Springer, 2016.
