BARM: A Batch-Aware Resource Manager for Boosting Multiple Neural Networks Inference on GPUs With Memory Oversubscription

Zhao-Wei Qiu, Kun-Sheng Liu, Ya-Shu Chen. BARM: A Batch-Aware Resource Manager for Boosting Multiple Neural Networks Inference on GPUs With Memory Oversubscription. IEEE Trans. Parallel Distrib. Syst., 33(12):4612-4624, 2022.

Abstract

Abstract is missing.