Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling

Sohaib Ahmad, Hui Guan, Brian D. Friedman, Thomas Williams, Ramesh K. Sitaraman, Thomas Y. C. Woo. Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling. In Rajiv Gupta, Nael B. Abu-Ghazaleh, Madan Musuvathi, Dan Tsafrir, editors, Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS 2024, La Jolla, CA, USA, 27 April 2024 - 1 May 2024, pages 318-334. ACM, 2024.
