Ming Chen, Yingming Li, Zhongfei Zhang, Siyu Huang. TVT: Two-View Transformer Network for Video Captioning. In Jun Zhu, Ichiro Takeuchi, editors, Proceedings of The 10th Asian Conference on Machine Learning, ACML 2018, Beijing, China, November 14-16, 2018. Volume 95 of Proceedings of Machine Learning Research, pages 847-862, JMLR.org, 2018.