The following publications are possibly variants of this publication:
- Hierarchical Conditional Relation Networks for Multimodal Video Question Answering. Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran 0001. IJCV, 129(11):3027-3050, 2021.
- Action-Centric Relation Transformer Network for Video Question Answering. Jipeng Zhang, Jie Shao, Rui Cao, Lianli Gao, Xing Xu 0001, Heng Tao Shen. TCSVT, 32(1):63-74, 2022.
- Multi-interaction Network with Object Relation for Video Question Answering. Weike Jin, Zhou Zhao, Mao Gu, Jun Yu 0002, Jun Xiao 0001, Yueting Zhuang. MM 2019: 1193-1201.
- Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering. Lianli Gao, Yu Lei, Pengpeng Zeng, Jingkuan Song, Meng Wang 0001, Heng Tao Shen. TIP, 31:202-215, 2022.
- Long-Form Video Question Answering via Dynamic Hierarchical Reinforced Networks. Zhou Zhao, Zhu Zhang, Shuwen Xiao, Zhenxin Xiao, Xiaohui Yan, Jun Yu, Deng Cai, Fei Wu 0001. TIP, 28(12):5939-5952, 2019.