Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving

Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci. Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving. In Phillip B. Gibbons, Gennady Pekhimenko, Christopher De Sa, editors, Proceedings of the Seventh Annual Conference on Machine Learning and Systems, MLSys 2024, Santa Clara, CA, USA, May 13-16, 2024. mlsys.org, 2024.

