AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. In Roxana Geambasu, Ed Nightingale, editors, 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2023, Boston, MA, USA, July 10-12, 2023. Pages 663-679, USENIX Association, 2023.
