Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Junyan Lin, Haoran Chen, Yue Fan, Yingqi Fan, Xin Jin, Hui Su, JinLan Fu, Xiaoyu Shen. Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pages 4156-4166, Computer Vision Foundation / IEEE, 2025.

