Bit-Serial Acceleration of LLM Inference With Mixture-of-Datatype Quantization

Yuzong Chen, Chi-Chih Chang, Xilai Dai, Ahmed F. AbouElhamayed, Marta Andronic, George A. Constantinides, Mohamed S. Abdelfattah. Bit-Serial Acceleration of LLM Inference With Mixture-of-Datatype Quantization. IEEE Transactions on Computers, 75(2):567-581, February 2026.

Abstract

Abstract is missing.