Nanily: A QoS-Aware Scheduling for DNN Inference Workload in Clouds

Xuehai Tang, Peng Wang, Qiuyang Liu, Wang Wang, Jizhong Han. Nanily: A QoS-Aware Scheduling for DNN Inference Workload in Clouds. In Zheng Xiao, Laurence T. Yang, Pavan Balaji, Tao Li, Keqin Li, Albert Y. Zomaya, editors, 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019, Zhangjiajie, China, August 10-12, 2019. Pages 2395-2402, IEEE, 2019.
