Yanying Lin, Shuaipeng Wu, Shutian Luo, Hong Xu, Haiying Shen, Chong Ma, Min Shen, Le Chen, Chengzhong Xu, Lin Qu, Kejiang Ye. Understanding Diffusion Model Serving in Production: A Top-Down Analysis of Workload, Scheduling, and Resource Efficiency. In Proceedings of the 2025 ACM Symposium on Cloud Computing, SoCC 2025, Online, USA, November 19-21, 2025. Pages 1-15, ACM, 2025.