DisenStudio: Customized Multi-Subject Text-to-Video Generation with Disentangled Spatial Control

Hong Chen, Xin Wang, Yipeng Zhang, Yuwei Zhou, Zeyang Zhang, Siao Tang, Wenwu Zhu. DisenStudio: Customized Multi-Subject Text-to-Video Generation with Disentangled Spatial Control. In Jianfei Cai, Mohan S. Kankanhalli, Balakrishnan Prabhakaran, Susanne Boll, Ramanathan Subramanian, Liang Zheng, Vivek K. Singh, Pablo César, Lexing Xie, Dong Xu, editors, Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024 - 1 November 2024, pages 3637-3646. ACM, 2024.