KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache

Fei Li, Song Liu, Weiguo Wu, Shiqiang Nie, Jinyu Wang. KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache. In Sven Koenig, Chad Jenkins, Matthew E. Taylor, editors, Fortieth AAAI Conference on Artificial Intelligence, Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence, Sixteenth Symposium on Educational Advances in Artificial Intelligence, AAAI 2026, Singapore, January 20-27, 2026. pages 31563-31572, AAAI Press, 2026.

Abstract

Abstract is missing.