AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. In Roxana Geambasu, Ed Nightingale, editors, 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2023, Boston, MA, USA, July 10-12, 2023. Pages 663-679, USENIX Association, 2023.

@inproceedings{LiZZL00HCZGS23,
  title = {AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving},
  author = {Zhuohan Li and Lianmin Zheng and Yinmin Zhong and Vincent Liu and Ying Sheng and Xin Jin and Yanping Huang and Zhifeng Chen and Hao Zhang and Joseph E. Gonzalez and Ion Stoica},
  year = {2023},
  url = {https://www.usenix.org/conference/osdi23/presentation/li-zhouhan},
  researchr = {https://researchr.org/publication/LiZZL00HCZGS23},
  pages = {663--679},
  booktitle = {17th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2023, Boston, MA, USA, July 10-12, 2023},
  editor = {Roxana Geambasu and Ed Nightingale},
  publisher = {USENIX Association},
}