Aqua: Network-Accelerated Memory Offloading for LLMs in Scale-Up GPU Domains

Abhishek Vijaya Kumar, Gianni Antichi, Rachee Singh. Aqua: Network-Accelerated Memory Offloading for LLMs in Scale-Up GPU Domains. In Lieven Eeckhout, Georgios Smaragdakis, Katai Liang, Adrian Sampson, Martha A. Kim, Christopher J. Rossbach, editors, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2025, Rotterdam, Netherlands, 30 March 2025 - 3 April 2025. Pages 48-62, ACM, 2025.
