InferLine: latency-aware provisioning and scaling for prediction serving pipelines

Daniel Crankshaw, Gur-Eyal Sela, Xiangxi Mo, Corey Zumar, Ion Stoica, Joseph Gonzalez, Alexey Tumanov. InferLine: latency-aware provisioning and scaling for prediction serving pipelines. In Rodrigo Fonseca, Christina Delimitrou, Beng Chin Ooi, editors, SoCC '20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 19-21, 2020. Pages 477-491, ACM, 2020.

@inproceedings{CrankshawSMZS0T20,
  title = {{InferLine}: latency-aware provisioning and scaling for prediction serving pipelines},
  author = {Daniel Crankshaw and Gur-Eyal Sela and Xiangxi Mo and Corey Zumar and Ion Stoica and Joseph Gonzalez and Alexey Tumanov},
  year = {2020},
  doi = {10.1145/3419111.3421285},
  url = {https://doi.org/10.1145/3419111.3421285},
  researchr = {https://researchr.org/publication/CrankshawSMZS0T20},
  pages = {477--491},
  booktitle = {SoCC '20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 19-21, 2020},
  editor = {Rodrigo Fonseca and Christina Delimitrou and Beng Chin Ooi},
  publisher = {ACM},
  isbn = {978-1-4503-8137-6},
}