HaViT: Hybrid-Attention Based Vision Transformer for Video Classification

Li Li, Liansheng Zhuang, Shenghua Gao, Shafei Wang. HaViT: Hybrid-Attention Based Vision Transformer for Video Classification. In Lei Wang, Juergen Gall, Tat-Jun Chin, Imari Sato, Rama Chellappa, editors, Computer Vision - ACCV 2022 - 16th Asian Conference on Computer Vision, Macao, China, December 4-8, 2022, Proceedings, Part IV. Volume 13844 of Lecture Notes in Computer Science, pages 502-517, Springer, 2022.

Abstract

Abstract is missing.