RT-LM: Uncertainty-Aware Resource Management for Real-Time Inference of Language Models

Yufei Li, Zexin Li, Wei Yang, Cong Liu. RT-LM: Uncertainty-Aware Resource Management for Real-Time Inference of Language Models. In IEEE Real-Time Systems Symposium, RTSS 2023, Taipei, Taiwan, December 5-8, 2023. Pages 158-171, IEEE, 2023.
