OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

Tom Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen. OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024. Pages 18888-18898, IEEE, 2024.
