Understanding Diffusion Model Serving in Production: A Top-Down Analysis of Workload, Scheduling, and Resource Efficiency

Yanying Lin, Shuaipeng Wu, Shutian Luo, Hong Xu, Haiying Shen, Chong Ma, Min Shen, Le Chen, Chengzhong Xu, Lin Qu, Kejiang Ye. Understanding Diffusion Model Serving in Production: A Top-Down Analysis of Workload, Scheduling, and Resource Efficiency. In Proceedings of the 2025 ACM Symposium on Cloud Computing, SoCC 2025, Online, USA, November 19-21, 2025. Pages 1-15, ACM, 2025.
