CMGNet: Collaborative multi-modal graph network for video captioning

Qi Rao, Xin Yu, Guang Li, Linchao Zhu. CMGNet: Collaborative multi-modal graph network for video captioning. Computer Vision and Image Understanding, 238:103864, January 2024.
