TeMTG: Text-Enhanced Multi-Hop Temporal Graph Modeling for Audio-Visual Video Parsing

Yaru Chen, Peiliang Zhang, Fei Li, Faegheh Sardari, Ruohao Guo, Zhenbo Li, Wenwu Wang. TeMTG: Text-Enhanced Multi-Hop Temporal Graph Modeling for Audio-Visual Video Parsing. In Zhongfei (Mark) Zhang, Elisa Ricci, Yan Yan, Liqiang Nie, Vincent Oria, Lamberto Ballan, editors, Proceedings of the 2025 International Conference on Multimedia Retrieval, ICMR 2025, Chicago, IL, USA, 30 June 2025 - 3 July 2025, pages 1978-1982. ACM, 2025.

Abstract

Abstract is missing.