Automated Runtime-Aware Scheduling for Multi-Tenant DNN Inference on GPU

Fuxun Yu, Shawn Bray, Di Wang, Longfei Shangguan, Xulong Tang, Chenchen Liu, Xiang Chen. Automated Runtime-Aware Scheduling for Multi-Tenant DNN Inference on GPU. In IEEE/ACM International Conference On Computer Aided Design, ICCAD 2021, Munich, Germany, November 1-4, 2021. pages 1-9, IEEE, 2021.
