Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo. MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023. Pages 10219-10228, IEEE, 2023.