FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Approach for Heterogeneous Edge Devices

Yuhao Chen, Qianqian Yang, Shibo He, Zhiguo Shi, Jiming Chen, Mohsen Guizani. FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Approach for Heterogeneous Edge Devices. IEEE Trans. Mob. Comput., 23(4):3200-3212, April 2024.
