AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models

Sangjun Lee, Seung-taek Woo, Jungyu Jin, Changhun Lee, Eunhyeok Park. AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models. In Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng, editors, Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, EMNLP 2025, Suzhou, China, November 4-9, 2025, pages 35532-35550. Association for Computational Linguistics, 2025.
