Adaptive gradient sparsification with layer and stage-wise for accelerating distributed DNN training

Waixi Liu, Jun Cai, Yue Yin, Zhen-xin Zhang, Kongyang Chen, Jian-Tao Fu, Wen-Li Shang. Adaptive gradient sparsification with layer and stage-wise for accelerating distributed DNN training. Computer Networks, 276:111983, 2026.

Abstract

Abstract is missing.