Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided Text Prompts

Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan. Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided Text Prompts. IEEE Trans. Pattern Anal. Mach. Intell., 46(5):3406-3421, May 2024.
