Efficient CUDA stream management for multi-DNN real-time inference on embedded GPUs

Weiguang Pang, Xiantong Luo, Kailun Chen, Dong Ji, Lei Qiao, Wang Yi. Efficient CUDA stream management for multi-DNN real-time inference on embedded GPUs. Journal of Systems Architecture, 139:102888, 2023.
