TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction

Junyi Liu, LiangZhi Li, Tong Xiang, Bowen Wang, Yiming Qian. TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 9796-9810. Association for Computational Linguistics, 2023.

Authors

Junyi Liu

LiangZhi Li

Tong Xiang

Bowen Wang

Yiming Qian
