CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system

Qi Zhang, Yi Liu, Tao Liu, Depei Qian. CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system. The Journal of Supercomputing, 79(13):14172-14199, September 2023. https://doi.org/10.1007/s11227-023-05183-6

@article{ZhangLLQ23,
  title = {{CoFB}: latency-constrained co-scheduling of flows and batches for deep learning inference service on the {CPU-GPU} system},
  author = {Qi Zhang and Yi Liu and Tao Liu and Depei Qian},
  year = {2023},
  month = {September},
  doi = {10.1007/s11227-023-05183-6},
  url = {https://doi.org/10.1007/s11227-023-05183-6},
  researchr = {https://researchr.org/publication/ZhangLLQ23},
  cites = {0},
  citedby = {0},
  journal = {The Journal of Supercomputing},
  volume = {79},
  number = {13},
  pages = {14172--14199},
}