Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-Based Model Caching and Inference Offloading

Minrui Xu, Dusit Niyato, Christopher G. Brinton. Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-Based Model Caching and Inference Offloading. IEEE/ACM Trans. Netw., 34:3808-3823, 2026.
