WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information

An Tran, Konstantinos Drossos, Tuomas Virtanen. WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information. In 29th European Signal Processing Conference, EUSIPCO 2021, Dublin, Ireland, August 23-27, 2021. Pages 576-580, IEEE, 2021.
