Ruihan Hu, Songbin Zhou, Zhiri Tang, Sheng Chang, Qijun Huang, Yisen Liu, Wei Han, Edmond Q. Wu. DMMAN: A two-stage audio-visual fusion framework for sound separation and event localization. Neural Networks, 133:229-239, 2021.