CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation

Kexin Li, Zongxin Yang, Lei Chen, Yi Yang, Jun Xiao. CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation. In Abdulmotaleb El-Saddik, Tao Mei, Rita Cucchiara, Marco Bertini, Diana Patricia Tobon Vallejo, Pradeep K. Atrey, M. Shamim Hossain, editors, Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, Ottawa, ON, Canada, 29 October 2023 - 3 November 2023, pages 1485-1494. ACM, 2023.
