Yue Peng, Richard O. Sinnott. A Hybrid Forecasting and Reinforcement Learning Approach for Latency-Constrained Scaling of Generative AI Systems. In Proceedings of the 18th IEEE/ACM International Conference on Utility and Cloud Computing, UCC 2025, Nantes, France, December 1-4, 2025. ACM, 2025.