Event-Specific Audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding

Arda Senocak, Junsik Kim, Tae Hyun Oh, Dingzeyu Li, In-So Kweon. Event-Specific Audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding. In IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023, Waikoloa, HI, USA, January 2-7, 2023. Pages 2236-2246, IEEE, 2023.

Abstract

Abstract is missing.