From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models

Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven C. H. Hoi. From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023, pages 10867-10877. IEEE, 2023.
