Diffusion-based diverse audio captioning with retrieval-guided Langevin dynamics

Yonggang Zhu, Aidong Men, Li Xiao. Diffusion-based diverse audio captioning with retrieval-guided Langevin dynamics. Information Fusion, 114:102643, 2025.

Abstract

Abstract is missing.