oFFN: Outlier and Neuron-aware Structured FFN for Fast yet Accurate LLM Inference

Geunsoo Song, Hoeseok Yang, Youngmin Yi. oFFN: Outlier and Neuron-aware Structured FFN for Fast yet Accurate LLM Inference. In Benjamin C. Lee, Harry Xu, Mark Silberstein, Bingyao Li, editors, Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2026, Pittsburgh, PA, USA, March 22-26, 2026. Pages 1301-1315, ACM, 2026.
