LUT-LLM: Efficient Language Model Inference with Memory-based Computations on FPGAs

Zifan He, Shengyu Ye, Rui Ma, Yang Wang, Jason Cong. LUT-LLM: Efficient Language Model Inference with Memory-based Computations on FPGAs. In 34th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2026, Atlanta, GA, USA, May 13-16, 2026, pages 109-118. IEEE, 2026.
