WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information

An Tran, Konstantinos Drossos, Tuomas Virtanen. WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information. In 29th European Signal Processing Conference, EUSIPCO 2021, Dublin, Ireland, August 23-27, 2021, pages 576-580. IEEE, 2021.

Authors

An Tran


Konstantinos Drossos


Tuomas Virtanen
