RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji. RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021, pages 15465-15474. Computer Vision Foundation / IEEE, 2021.
