Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding. VDT: General-purpose Video Diffusion Transformers via Mask Modeling. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.