CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system

Qi Zhang, Yi Liu, Tao Liu, Depei Qian. CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system. The Journal of Supercomputing, 79(13):14172-14199, September 2023.
