3: Hybrid Architecture Using High Bandwidth Memory and High Bandwidth Flash for Cost-Efficient LLM Inference

Minho Ha, Euiseok Kim, Hoshik Kim. 3: Hybrid Architecture Using High Bandwidth Memory and High Bandwidth Flash for Cost-Efficient LLM Inference. Computer Architecture Letters, 25(1):49-52, January - June 2026.
