X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers

Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi. X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers. In Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pages 8785-8805. Association for Computational Linguistics, 2020.
