BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache

Dayou Du, Shijie Cao, Jianyi Cheng, Luo Mai, Ting Cao, Mao Yang. BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache. In IEEE International Symposium on High Performance Computer Architecture, HPCA 2026, Sydney, Australia, January 31 - February 4, 2026. Pages 1-13, IEEE, 2026.

Abstract

Abstract is missing.