A 51.6μJ/Token Subspace-Rotation-Based Dual-Quantized Large-language-Model Accelerator with Fused Scale-Activation INT Datapath and Rearranged Bit-Slice LUT Computation

Bo Liu, Zihan Zou, Xinming Yan, Xilong Kang, Xinyang Chen, Bo Hu, Jiaming Lin, Haoran Du, Jun Yang, Xin Si, Hao Cai. "A 51.6μJ/Token Subspace-Rotation-Based Dual-Quantized Large-Language-Model Accelerator with Fused Scale-Activation INT Datapath and Rearranged Bit-Slice LUT Computation." In IEEE International Solid-State Circuits Conference (ISSCC 2026), San Francisco, CA, USA, February 15-19, 2026, pages 536-538. IEEE, 2026.
