TASTA: Text-Assisted Spatial and Temporal Attention Network for Video Question Answering

Tian Wang, Boyao Hou, Jiakun Li, Peng Shi, Baochang Zhang, Hichem Snoussi. TASTA: Text-Assisted Spatial and Temporal Attention Network for Video Question Answering. Adv. Intell. Syst., 5(4), April 2023.
