Dynamic Load Balancing for Distributed Large Model Training: A Hybrid Framework of Gray Markov Chain and MDP

Yonggang Li, Rui Ji, Yaotong Su, Yuanjin Zhang, Andong Zhang, Longjiang Li. Dynamic Load Balancing for Distributed Large Model Training: A Hybrid Framework of Gray Markov Chain and MDP. Concurrency - Practice and Experience, 38(1), January 2026.
