A duet of perception and reasoning: CLIP and LLM brainstorming for scene text recognition

Zeguang Jia, Jianming Wang, Kehui Song, Zhilan Wang, Xiaohan Ma, Rize Jin. A duet of perception and reasoning: CLIP and LLM brainstorming for scene text recognition. Neurocomputing, 666:132236, 2026.

Abstract

Abstract is missing.