FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models

Giulio Corallo, Paolo Papotti. FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models. TACL, 12:1517-1532, 2024.
