Let's Roll a BiFTA: Bi-refinement for Fine-grained Text-visual Alignment in Vision-Language Models

Yuhao Sun, Chengyi Cai, Jiacheng Zhang, Zesheng Ye, Xingliang Yuan, Feng Liu 0003. Let's Roll a BiFTA: Bi-refinement for Fine-grained Text-visual Alignment in Vision-Language Models. Trans. Mach. Learn. Res., 2026, 2026.

Authors

Yuhao Sun

Chengyi Cai

Jiacheng Zhang

Zesheng Ye

Xingliang Yuan

Feng Liu 0003
