Addressing Straggler Problem Through Dynamic Partial All-Reduce for Distributed Deep Learning in Heterogeneous GPU Clusters

HyungJun Kim, Chunggeon Song, HwaMin Lee, HeonChang Yu. Addressing Straggler Problem Through Dynamic Partial All-Reduce for Distributed Deep Learning in Heterogeneous GPU Clusters. In IEEE International Conference on Consumer Electronics, ICCE 2023, Las Vegas, NV, USA, January 6-8, 2023, pages 1-6. IEEE, 2023.
