Optimizing distributed training deployment in heterogeneous GPU clusters

Xiaodong Yi, Shiwei Zhang, Ziyue Luo, Guoping Long, Lansong Diao, Chuan Wu, Zhen Zheng, Jun Yang, Wei Lin. Optimizing distributed training deployment in heterogeneous GPU clusters. In Dongsu Han, Anja Feldmann, editors, CoNEXT '20: The 16th International Conference on emerging Networking EXperiments and Technologies, Barcelona, Spain, December 2020, pages 93-107. ACM, 2020.
