Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events

Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu. Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. Pages 606-610, IEEE, 2021.
