iCNN-Transformer: An improved CNN-Transformer with Channel-spatial Attention and Keyword Prediction for Automated Audio Captioning

Kun Chen, Jun Wang 0077, Feng Deng, Xiaorui Wang. iCNN-Transformer: An improved CNN-Transformer with Channel-spatial Attention and Keyword Prediction for Automated Audio Captioning. In Hanseok Ko, John H. L. Hansen, editors, Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, Korea, 18-22 September 2022. pages 4167-4171, ISCA, 2022. [doi]

Abstract

Abstract is missing.