AMPS-Inf: Automatic Model Partitioning for Serverless Inference with Cost Efficiency

Jananie Jarachanthan, Li Chen, Fei Xu, Bo Li. AMPS-Inf: Automatic Model Partitioning for Serverless Inference with Cost Efficiency. In Xian-He Sun, Sameer Shende, Laxmikant V. Kalé, Yong Chen, editors, ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9-12, 2021. ACM, 2021.
