InferLine: latency-aware provisioning and scaling for prediction serving pipelines

Daniel Crankshaw, Gur-Eyal Sela, Xiangxi Mo, Corey Zumar, Ion Stoica, Joseph Gonzalez, Alexey Tumanov. InferLine: latency-aware provisioning and scaling for prediction serving pipelines. In Rodrigo Fonseca, Christina Delimitrou, Beng Chin Ooi, editors, SoCC '20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 19-21, 2020. Pages 477-491, ACM, 2020.

Authors

Daniel Crankshaw

Gur-Eyal Sela

Xiangxi Mo

Corey Zumar

Ion Stoica

Joseph Gonzalez

Alexey Tumanov