Temporal Modeling Approach for Video Action Recognition Based on Vision-language Models

Yue Huang, Xiaodong Gu. Temporal Modeling Approach for Video Action Recognition Based on Vision-language Models. In Biao Luo, Long Cheng, Zheng-Guang Wu, Hongyi Li, Chaojie Li, editors, Neural Information Processing - 30th International Conference, ICONIP 2023, Changsha, China, November 20-23, 2023, Proceedings, Part III. Volume 14449 of Lecture Notes in Computer Science, pages 512-523, Springer, 2023. doi:10.1007/978-981-99-8067-3_38

@inproceedings{HuangG23-9,
  title = {Temporal Modeling Approach for Video Action Recognition Based on Vision-language Models},
  author = {Yue Huang and Xiaodong Gu},
  year = {2023},
  doi = {10.1007/978-981-99-8067-3_38},
  url = {https://doi.org/10.1007/978-981-99-8067-3_38},
  researchr = {https://researchr.org/publication/HuangG23-9},
  cites = {0},
  citedby = {0},
  pages = {512--523},
  booktitle = {Neural Information Processing - 30th International Conference, ICONIP 2023, Changsha, China, November 20-23, 2023, Proceedings, Part III},
  editor = {Biao Luo and Long Cheng and Zheng-Guang Wu and Hongyi Li and Chaojie Li},
  volume = {14449},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-981-99-8067-3},
}