Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues

Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani. Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues. In Gernot Kubin, Zdravko Kacic, editors, Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pages 2718-2722. ISCA, 2019.

Abstract

Abstract is missing.