ViLT-CLIP: Video and Language Tuning CLIP with Multimodal Prompt Learning and Scenario-Guided Optimization

Hao Wang, Fang Liu, Licheng Jiao, Jiahao Wang, Zehua Hao, Shuo Li 0010, Lingling Li 0002, Puhua Chen, Xu Liu 0006. ViLT-CLIP: Video and Language Tuning CLIP with Multimodal Prompt Learning and Scenario-Guided Optimization. In Michael J. Wooldridge, Jennifer G. Dy, Sriraam Natarajan, editors, Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20-27, 2024, Vancouver, Canada. Pages 5390-5400, AAAI Press, 2024.

Authors

Hao Wang

Fang Liu

Licheng Jiao

Jiahao Wang

Zehua Hao

Shuo Li 0010

Lingling Li 0002

Puhua Chen

Xu Liu 0006
