Multimodal grid features and cell pointers for scene text visual question answering

Lluís Gómez, Ali Furkan Biten, Rubèn Pérez Tito, Andrés Mafla, Marçal Rusiñol, Ernest Valveny, Dimosthenis Karatzas. Multimodal grid features and cell pointers for scene text visual question answering. Pattern Recognition Letters, 150:242-249, 2021.

Abstract

Abstract is missing.