FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs

Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang. FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs. In Zhiru Zhang, Andrew Putnam, editors, Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2024, Monterey, CA, USA, March 3-5, 2024. Pages 223-234, ACM, 2024.
