A quantitative analysis of fault tolerance mechanisms for parallel machine learning systems with parameter servers

Mingxi Li, Yusuke Tanimura, Hidemoto Nakada. A quantitative analysis of fault tolerance mechanisms for parallel machine learning systems with parameter servers. In Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication, IMCOM 2017, Beppu, Japan, January 5-7, 2017. Page 69, ACM, 2017.
