Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping


Chenyu Jiang, Ye Tian, Zhen Jia, Shuai Zheng, Chuan Wu, Yida Wang. Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping. In Phillip B. Gibbons, Gennady Pekhimenko, Christopher De Sa, editors, Proceedings of the Seventh Annual Conference on Machine Learning and Systems, MLSys 2024, Santa Clara, CA, USA, May 13-16, 2024. mlsys.org, 2024.
