Enhancing Visual Question Answering with Pre-trained Vision-Language Models: An Ensemble Approach at the LAVA Challenge 2024

Trong-Hieu Nguyen Mau, Nhu-Binh Nguyen Truc, Nhu-Vinh Hoang, Minh-Triet Tran, Hai Dang Nguyen. Enhancing Visual Question Answering with Pre-trained Vision-Language Models: An Ensemble Approach at the LAVA Challenge 2024. In Minsu Cho, Ivan Laptev, Du Tran, Angela Yao, Hongbin Zha, editors, Computer Vision - ACCV 2024 Workshops - 17th Asian Conference on Computer Vision, Hanoi, Vietnam, December 8-12, 2024, Revised Selected Papers, Part I. Volume 15482 of Lecture Notes in Computer Science, pages 281-292, Springer, 2024.
