Multimodal active speaker detection using cross-attention and contextual information

Bogdan Mocanu, Ruxandra Tapu. Multimodal active speaker detection using cross-attention and contextual information. In IEEE International Conference on Consumer Electronics, ICCE 2024, Las Vegas, NV, USA, January 6-8, 2024. Pages 1-2, IEEE, 2024.