Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences

Mingcong Han, Hanze Zhang, Rong Chen, Haibo Chen. Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences. In Marcos K. Aguilera and Hakim Weatherspoon, editors, 16th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2022, Carlsbad, CA, USA, July 11-13, 2022, pages 539-558. USENIX Association, 2022.
