From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA

Zan-Xia Jin, Mike Zheng Shou, Fang Zhou, Satoshi Tsutsui, Jingyan Qin, Xu-Cheng Yin. From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA. In João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni, editors, MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10-14, 2022, pages 4564-4572. ACM, 2022.
