Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags

Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra. Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. pages 596-600, IEEE, 2021.

Abstract

Abstract is missing.