OpenViVQA: Task, dataset, and multimodal fusion models for visual question answering in Vietnamese

Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen. OpenViVQA: Task, dataset, and multimodal fusion models for visual question answering in Vietnamese. Information Fusion, 100:101868, December 2023.
