Efficient LLM Inference via Chunked Prefills

Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Alexey Tumanov, Ramachandran Ramjee. Efficient LLM Inference via Chunked Prefills. Operating Systems Review, 59(1):9-16, July 2025.

Abstract

Abstract is missing.