Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

Chia-Wen Kuo, Zsolt Kira. Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pages 17948-17958. IEEE, 2022.
