Identifying the Right Replication Level to Detect and Correct Silent Errors at Scale

Anne Benoit, Aurélien Cavelan, Franck Cappello, Padma Raghavan, Yves Robert, Hongyang Sun. Identifying the Right Replication Level to Detect and Correct Silent Errors at Scale. In Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale, FTXS@HPDC 2017, Washington, DC, USA, June 2017. Pages 31-38, ACM, 2017.
