Sequence-Aware Learnable Sparse Mask for Frame-Selectable End-to-End Dense Video Captioning for IoT Smart Cameras

Syu-Huei Huang, Ching-Hu Lu. Sequence-Aware Learnable Sparse Mask for Frame-Selectable End-to-End Dense Video Captioning for IoT Smart Cameras. IEEE Internet of Things Journal, 11(7):13039-13050, April 2024.

Abstract

Abstract is missing.