Multi-Modal Instruction Tuned LLMs with Fine-Grained Visual Perception

Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie. Multi-Modal Instruction Tuned LLMs with Fine-Grained Visual Perception. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024, pages 13980-13990. IEEE, 2024.
