VDT: General-purpose Video Diffusion Transformers via Mask Modeling

Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding. VDT: General-purpose Video Diffusion Transformers via Mask Modeling. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.

Authors

Haoyu Lu

Guoxing Yang

Nanyi Fei

Yuqi Huo

Zhiwu Lu

Ping Luo

Mingyu Ding
