Tao Luo, Kelvin K. W. Ng, Zhen Ping Khor, Sidharth Sankhe, Boon Thau Loo, Vincent Liu. Multiplexed Heterogeneous LLM Serving via Stage-Aligned Parallelism. In Proceedings of the 2025 ACM Symposium on Cloud Computing, SoCC 2025, Online, USA, November 19-21, 2025. pages 735-747, ACM, 2025.