The following publications are possibly variants of this publication:
- SPTNET: Span-based Prompt Tuning for Video Grounding. Yiren Zhang, Yuanwu Xu, MoHan Chen, Yuejie Zhang, Rui Feng, Shang Gao 0003. icmcs 2023: 2807-2812
- CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding. Zhijian Hou, Wanjun Zhong, Lei Ji 0001, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Mike Zheng Shou, Nan Duan. acl 2023: 8013-8028
- Fine-Grained Text-to-Video Temporal Grounding from Coarse Boundary. Jiachang Hao, Haifeng Sun 0001, Pengfei Ren, Yiming Zhong, Jingyu Wang 0001, Qi Qi 0001, Jianxin Liao. tomccap, 19(5), 2023.
- STVGBert: A Visual-linguistic Transformer based Framework for Spatio-temporal Video Grounding. Rui Su, Qian Yu, Dong Xu 0001. iccv 2021: 1513-1522
- Human-Centric Spatio-Temporal Video Grounding With Visual Transformers. Zongheng Tang, Yue Liao, Si Liu 0001, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu 0001. tcsv, 32(12):8238-8249, 2022.
- ProTéGé: Untrimmed Pretraining for Video Temporal Grounding by Video Temporal Grounding. Lan Wang, Gaurav Mittal, Sandra Sajeev, Ye Yu 0003, Matthew Hall, Vishnu Naresh Boddeti, Mei Chen. cvpr 2023: 6575-6585