Lazy Batching: An SLA-aware Batching System for Cloud Machine Learning Inference

Yujeong Choi, Yunseong Kim, Minsoo Rhu. Lazy Batching: An SLA-aware Batching System for Cloud Machine Learning Inference. In IEEE International Symposium on High-Performance Computer Architecture, HPCA 2021, Seoul, South Korea, February 27 - March 3, 2021. pages 493-506, IEEE, 2021.

Abstract

Abstract is missing.