AVCLNet: Multimodal Multispeaker Tracking Network Using Audio-Visual Contrastive Learning

Yihan Li, Yidi Li, Zhenhuan Xu, Hao Guo, Mengyuan Liu 0001, Weiwei Wan. AVCLNet: Multimodal Multispeaker Tracking Network Using Audio-Visual Contrastive Learning. CAAI Trans. Intell. Technol., 11(1):238-255, 2026.

Authors

Yihan Li

Yidi Li

Zhenhuan Xu

Hao Guo

Mengyuan Liu 0001

Weiwei Wan
