STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training

Weihong Zhong, Mao Zheng, Duyu Tang, Xuan Luo, Heng Gong, Xiaocheng Feng, Bing Qin. STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training. In Brian Williams, Yiling Chen, Jennifer Neville, editors, Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February 7-14, 2023, pages 3715-3723. AAAI Press, 2023.
