A Transformer-Based Audio Captioning Model with Keyword Estimation

Yuma Koizumi, Ryo Masumura, Kyosuke Nishida, Masahiro Yasuda, Shoichiro Saito. A Transformer-Based Audio Captioning Model with Keyword Estimation. In Helen Meng, Bo Xu, Thomas Fang Zheng, editors, Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020. Pages 1977-1981, ISCA, 2020.

Authors

Yuma Koizumi

Ryo Masumura

Kyosuke Nishida

Masahiro Yasuda

Shoichiro Saito
