VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG

Junkyum Kim, Divya Mahajan. VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG. In IEEE International Symposium on High Performance Computer Architecture, HPCA 2026, Sydney, Australia, January 31 - February 4, 2026. Pages 1-15, IEEE, 2026.
