Multimodal Network with Cross-Modal Attention for Audio-Visual Event Localization

Qianchao Tan. Multimodal Network with Cross-Modal Attention for Audio-Visual Event Localization. In Dingwen Zhang, Chaowei Fang, Wu Liu, Xinchen Liu, Jingkuan Song, Hongyuan Zhu, Wenbing Huang, John Smith, editors, HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, Lisboa, Portugal, October 10, 2022. Pages 71-78, ACM, 2022.
