Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding

Tianyu Zhang, Kaige Liu, Jack Kosaian, Juncheng Yang, Rashmi Vinayak. Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding. PVLDB, 16(11):3137-3150, 2023.
