The following publications are possibly variants of this publication:
- Temporal-attentive Covariance Pooling Networks for Video Recognition. Zilin Gao, Qilong Wang, Bingbing Zhang, Qinghua Hu, Peihua Li. NIPS 2021: 13587-13598
- GHAN: Graph-Based Hierarchical Aggregation Network for Text-Video Retrieval. Yahan Yu, Bojie Hu, Yu Li. EMNLP 2022: 5547-5557
- What Matters: Attentive and Relational Feature Aggregation Network for Video-Text Retrieval. Xiaoshuai Hao, Yucan Zhou, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang, Dan Meng. ICMCS 2021: 1-6