KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models

Bo Lv, Quan Zhou, Xuanang Ding, Yan Wang, Zeming Ma. KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models. In 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2025, Hyderabad, India, April 6-11, 2025, pages 1-5. IEEE, 2025.
