Spatio-Temporal Attention Models for Grounded Video Captioning

Mihai Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu. Spatio-Temporal Attention Models for Grounded Video Captioning. In Shang-Hong Lai, Vincent Lepetit, Ko Nishino, Yoichi Sato, editors, Computer Vision - ACCV 2016 - 13th Asian Conference on Computer Vision, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part IV. Volume 10114 of Lecture Notes in Computer Science, pages 104-119, Springer, 2016.

Abstract

Abstract is missing.