The following publications are possibly variants of this publication:
- Efficient Pipeline Planning for Expedited Distributed DNN Training. Ziyue Luo, Xiaodong Yi 0001, Guoping Long, Shiqing Fan, Chuan Wu 0001, Jun Yang, Wei Lin 0016. INFOCOM 2022: 340-349 [doi]
- dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training. Hanpeng Hu, Chenyu Jiang, Yuchen Zhong, Yanghua Peng, Chuan Wu 0001, Yibo Zhu, Haibin Lin, Chuanxiong Guo. MLSys 2022. [doi]
- Expediting Distributed DNN Training With Device Topology-Aware Graph Deployment. Shiwei Zhang, Xiaodong Yi 0001, Lansong Diao, Chuan Wu 0001, Siyu Wang, Wei Lin 0016. TPDS, 34(4):1281-1293, April 2023. [doi]
- GradientFlow: Optimizing Network Performance for Large-Scale Distributed DNN Training. Peng Sun 0006, Yonggang Wen 0001, Ruobing Han, Wansen Feng, Shengen Yan. TBD, 8(2):495-507, 2022. [doi]
- HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training. Yabo Duan, Zhiquan Lai, Shengwei Li, Weijie Liu, Keshi Ge, Peng Liang, Dongsheng Li. CLUSTER 2022: 313-323 [doi]
- Communication Analysis for Multidimensional Parallel Training of Large-scale DNN Models. Zhiquan Lai, Yanqi Hao, Shengwei Li, Dongsheng Li 0001. HPCC 2023: 728-729 [doi]
- DistSim: A performance model of large-scale hybrid distributed DNN training. Guandong Lu, Runzhe Chen, Yakai Wang, Yangjie Zhou 0001, Rui Zhang, Zheng Hu, Yanming Miao, Zhifang Cai, Li Li 0012, Jingwen Leng, Minyi Guo. CF 2023: 112-122 [doi]