Towards Optimal Preemptive GPU Time-Sharing for Edge Model Serving

Zhengxu Xia, Yitian Hao, Jun Duan, Chen Wang, Junchen Jiang. Towards Optimal Preemptive GPU Time-Sharing for Edge Model Serving. In Proceedings of the 9th International Workshop on Container Technologies and Container Clouds, WoC 2023, Bologna, Italy, December 11-15, 2023, pages 13-18. ACM, 2023.
