Accelerating Distributed Training in Heterogeneous Clusters via a Straggler-Aware Parameter Server

Huihuang Yu, Zongwei Zhu, Xianglan Chen, Yuming Cheng, Yahui Hu, Xi Li. Accelerating Distributed Training in Heterogeneous Clusters via a Straggler-Aware Parameter Server. In Zheng Xiao, Laurence T. Yang, Pavan Balaji, Tao Li, Keqin Li, Albert Y. Zomaya, editors, 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019, Zhangjiajie, China, August 10-12, 2019. Pages 200-207, IEEE, 2019. doi:10.1109/HPCC/SmartCity/DSS.2019.00042

@inproceedings{YuZCCH019,
  title = {Accelerating Distributed Training in Heterogeneous Clusters via a Straggler-Aware Parameter Server},
  author = {Huihuang Yu and Zongwei Zhu and Xianglan Chen and Yuming Cheng and Yahui Hu and Xi Li},
  year = {2019},
  doi = {10.1109/HPCC/SmartCity/DSS.2019.00042},
  url = {https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00042},
  researchr = {https://researchr.org/publication/YuZCCH019},
  cites = {0},
  citedby = {0},
  pages = {200--207},
  booktitle = {21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019, Zhangjiajie, China, August 10-12, 2019},
  editor = {Zheng Xiao and Laurence T. Yang and Pavan Balaji and Tao Li and Keqin Li and Albert Y. Zomaya},
  publisher = {IEEE},
  isbn = {978-1-7281-2058-4},
}