AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. In Roxana Geambasu, Ed Nightingale, editors, 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2023, Boston, MA, USA, July 10-12, 2023. Pages 663-679, USENIX Association, 2023.

Authors

Zhuohan Li

Lianmin Zheng

Yinmin Zhong

Vincent Liu

Ying Sheng

Xin Jin

Yanping Huang

Zhifeng Chen

Hao Zhang

Joseph E. Gonzalez

Ion Stoica
