Focal and Composed Vision-semantic Modeling for Visual Question Answering

Yudong Han, Yangyang Guo, Jianhua Yin, Meng Liu, Yupeng Hu, Liqiang Nie. Focal and Composed Vision-semantic Modeling for Visual Question Answering. In Heng Tao Shen, Yueting Zhuang, John R. Smith, Yang Yang, Pablo Cesar, Florian Metze, Balakrishnan Prabhakaran, editors, MM '21: ACM Multimedia Conference, Virtual Event, China, October 20-24, 2021, pages 4528-4536. ACM, 2021.

Abstract

Abstract is missing.