3: Hybrid Architecture Using High Bandwidth Memory and High Bandwidth Flash for Cost-Efficient LLM Inference

Minho Ha, Euiseok Kim, Hoshik Kim. 3: Hybrid Architecture Using High Bandwidth Memory and High Bandwidth Flash for Cost-Efficient LLM Inference. Computer Architecture Letters, 25(1):49-52, January-June 2026.

Abstract

Abstract is missing.