BARM: A Batch-Aware Resource Manager for Boosting Multiple Neural Networks Inference on GPUs With Memory Oversubscription

Zhao-Wei Qiu, Kun-Sheng Liu, Ya-Shu Chen. BARM: A Batch-Aware Resource Manager for Boosting Multiple Neural Networks Inference on GPUs With Memory Oversubscription. IEEE Trans. Parallel Distrib. Syst., 33(12):4612-4624, 2022.

@article{QiuLC22,
  title = {{BARM}: A Batch-Aware Resource Manager for Boosting Multiple Neural Networks Inference on {GPUs} With Memory Oversubscription},
  author = {Zhao-Wei Qiu and Kun-Sheng Liu and Ya-Shu Chen},
  year = {2022},
  doi = {10.1109/TPDS.2022.3199806},
  url = {https://doi.org/10.1109/TPDS.2022.3199806},
  researchr = {https://researchr.org/publication/QiuLC22},
  cites = {0},
  citedby = {0},
  journal = {IEEE Trans. Parallel Distrib. Syst.},
  volume = {33},
  number = {12},
  pages = {4612--4624},
}