FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models

Giulio Corallo, Paolo Papotti. FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models. TACL, 12:1517-1532, 2024.
