MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization

Shuaiting Li, Chengxuan Wang, Juncan Deng, Zeyu Wang, Zewen Ye, Zongsheng Wang, Haibin Shen, Kejie Huang. MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization. In Lieven Eeckhout, Georgios Smaragdakis, Kaitai Liang, Adrian Sampson, Martha A. Kim, Christopher J. Rossbach, editors, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS 2025, Rotterdam, The Netherlands, 30 March 2025 - 3 April 2025, pages 731-745. ACM, 2025.

Abstract

Abstract is missing.