Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk

Zhiyuan Zeng, Qipeng Guo, Xiaoran Liu, Zhangyue Yin, Wentao Shu, Mianqiu Huang, Bo Wang, Yunhua Zhou, Linlin Li, Qun Liu, Xipeng Qiu. Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024. Pages 21021-21034, Association for Computational Linguistics, 2024.

Abstract

Abstract is missing.