AxCore: A Quantization-Aware Approximate GEMM Unit for LLM Inference

Jiaxiang Zou, Yonghao Chen, Xingyu Chen, Chenxi Xu, Xinyu Chen. AxCore: A Quantization-Aware Approximate GEMM Unit for LLM Inference. In Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, MICRO 2025, Seoul, Republic of Korea, October 18-22, 2025. pages 839-853, ACM, 2025.

Abstract

Abstract is missing.