X-STA: Cross-Modal Spatial-Temporal Alignment Network for Unified Audio-Visual Segmentation

Hanyu Xuan, Tongxing Liu, Wenxiang Dong, Zhongheng Li, Shuo Chen. X-STA: Cross-Modal Spatial-Temporal Alignment Network for Unified Audio-Visual Segmentation. IEEE Signal Process. Lett., 32:2883-2887, 2025.

Abstract

Abstract is missing.