Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems

Mehran Salmani, Saeid Ghafouri, Alireza Sanaee, Kamran Razavi, Max Mühlhäuser, Joseph Doyle, Pooyan Jamshidi, Mohsen Sharifi. Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems. In Eiko Yoneki, Luigi Nardi, editors, Proceedings of the 3rd Workshop on Machine Learning and Systems, EuroMLSys 2023, Rome, Italy, 8 May 2023, pages 78-86. ACM, 2023.

@inproceedings{SalmaniGSRMDJS23,
  title = {Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems},
  author = {Mehran Salmani and Saeid Ghafouri and Alireza Sanaee and Kamran Razavi and Max Mühlhäuser and Joseph Doyle and Pooyan Jamshidi and Mohsen Sharifi},
  year = {2023},
  doi = {10.1145/3578356.3592578},
  url = {https://doi.org/10.1145/3578356.3592578},
  researchr = {https://researchr.org/publication/SalmaniGSRMDJS23},
  cites = {0},
  citedby = {0},
  pages = {78--86},
  booktitle = {Proceedings of the 3rd Workshop on Machine Learning and Systems, EuroMLSys 2023, Rome, Italy, 8 May 2023},
  editor = {Eiko Yoneki and Luigi Nardi},
  publisher = {ACM},
}