Optimizing distributed training deployment in heterogeneous GPU clusters

Xiaodong Yi, Shiwei Zhang, Ziyue Luo, Guoping Long, Lansong Diao, Chuan Wu, Zhen Zheng, Jun Yang, Wei Lin. Optimizing distributed training deployment in heterogeneous GPU clusters. In Dongsu Han, Anja Feldmann, editors, CoNEXT '20: The 16th International Conference on emerging Networking EXperiments and Technologies, Barcelona, Spain, December 2020. Pages 93-107, ACM, 2020.

Authors

Xiaodong Yi

Shiwei Zhang

Ziyue Luo

Guoping Long

Lansong Diao

Chuan Wu

Zhen Zheng

Jun Yang

Wei Lin
