Pushing Bit-Width Limits in LLM Quantization with Saliency-Guided Mix-Precision Allocation and Learnable Affine Transformation

Shuoyu Ma, Wenrui Dai, Maida Cao, Shaohui Li, Ziyang Zheng, Chenglin Li, Junni Zou, Hongkai Xiong. Pushing Bit-Width Limits in LLM Quantization with Saliency-Guided Mix-Precision Allocation and Learnable Affine Transformation. In Data Compression Conference, DCC 2026, Snowbird, UT, USA, March 24-27, 2026. Page 455, IEEE, 2026.
