The following publications are possibly variants of this publication:
- GradSA: Gradient Sparsification and Accumulation for Communication-Efficient Distributed Deep Learning. Bo Liu, Wenbin Jiang, Shaofeng Zhao, Hai Jin 0001, Bingsheng He. GPC 2020: 77-91 [doi]
- MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning. Shaohuai Shi, Xiaowen Chu, Bo Li. TPDS, 32(8):1903-1917, 2021. [doi]
- Gradient Sparsification for Communication-Efficient Distributed Optimization. Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang. NIPS 2018: 1306-1316 [doi]
- OASR-WFBP: An Overlapping Aware Start-up Sharing Gradient Merging Strategy for Efficient Communication in Distributed Deep Learning. Yingjie Song, Zhuo Tang, Yaohua Wang, Xiong Xiao, Zhizhong Liu, Jing Xia, Kenli Li 0001. JPDC, 196:104997, 2025. [doi]
- Layer-Wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees. Shaohuai Shi, Zhenheng Tang, Qiang Wang 0022, Kaiyong Zhao, Xiaowen Chu. ECAI 2020: 1467-1474 [doi]
- Communication Usage Optimization of Gradient Sparsification with Aggregation in Deep Learning. Sheng-Ping Wang, Pangfeng Liu, Jan-Jan Wu. ICNCC 2018: 22-26 [doi]
- SSD-SGD: Communication Sparsification for Distributed Deep Learning Training. Yemao Xu, Dezun Dong, Dongsheng Wang, Shi Xu, Enda Yu, Weixia Xu, Xiangke Liao. TACO, 20(1), March 2023. [doi]
- A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification. Shaohuai Shi, Kaiyong Zhao, Qiang Wang, Zhenheng Tang, Xiaowen Chu. IJCAI 2019: 3411-3417 [doi]