Enhancing Ocean Scene Video Captioning with Multimodal Pre-Training and Video-Swin-Transformer

Xinyu Chen, Meng Zhao, Fan Shi, Meng'en Zhang, Yu He, Shengyong Chen. Enhancing Ocean Scene Video Captioning with Multimodal Pre-Training and Video-Swin-Transformer. In 49th Annual Conference of the IEEE Industrial Electronics Society, IECON 2023, Singapore, October 16-19, 2023. pages 1-6, IEEE, 2023.
