Toward Cost-Efficient LLM Serving: A System-Level Memory Optimization Approach

Geunsik Lim. Toward Cost-Efficient LLM Serving: A System-Level Memory Optimization Approach. In 23rd Consumer Communications & Networking Conference, CCNC 2026, Las Vegas, NV, USA, January 9-12, 2026. pages 1-8, IEEE, 2026.

Abstract

Abstract is missing.