Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders

Nicola Messina, Giuseppe Amato, Andrea Esuli, Fabrizio Falchi, Claudio Gennaro, Stéphane Marchand-Maillet. Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders. TOMCCAP, 17(4), 2021.
