Accelerating Distributed Training in Heterogeneous Clusters via a Straggler-Aware Parameter Server

Huihuang Yu, Zongwei Zhu, Xianglan Chen, Yuming Cheng, Yahui Hu, Xi Li 0003. Accelerating Distributed Training in Heterogeneous Clusters via a Straggler-Aware Parameter Server. In Zheng Xiao, Laurence T. Yang, Pavan Balaji, Tao Li, Keqin Li 0001, Albert Y. Zomaya, editors, 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019, Zhangjiajie, China, August 10-12, 2019. pages 200-207, IEEE, 2019.
