Defying Distractions in Multimodal Tasks: A Novel Benchmark for Large Vision-Language Models

Jinhui Yang, Ming Jiang, Qi Zhao. Defying Distractions in Multimodal Tasks: A Novel Benchmark for Large Vision-Language Models. IEEE Trans. Pattern Anal. Mach. Intell., 48(6):6314-6331, June 2026.
