Attention-Based Multimodal Deep Learning on Vision-Language Data: Models, Datasets, Tasks, Evaluation Metrics and Applications

Priyankar Bose, Pratip Rana, Preetam Ghosh. Attention-Based Multimodal Deep Learning on Vision-Language Data: Models, Datasets, Tasks, Evaluation Metrics and Applications. IEEE Access, 11:80624-80646, 2023.
