VioLET: Vision-Language Efficient Tuning with Collaborative Multi-modal Gradients

Yaoming Wang, Yuchen Liu, Xiaopeng Zhang, Jin Li, Bowen Shi, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian. VioLET: Vision-Language Efficient Tuning with Collaborative Multi-modal Gradients. In Abdulmotaleb El-Saddik, Tao Mei, Rita Cucchiara, Marco Bertini, Diana Patricia Tobon Vallejo, Pradeep K. Atrey, M. Shamim Hossain, editors, Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, Ottawa, ON, Canada, 29 October - 3 November 2023, pages 4595-4605. ACM, 2023.
