Achieving Target MTTF by Duplicating Reliability-Critical Components in High Performance Computing Systems

Nithin Nakka, Alok N. Choudhary, Gary Grider, John Bent, James Nunez, Satsangat Khalsa. Achieving Target MTTF by Duplicating Reliability-Critical Components in High Performance Computing Systems. In 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16-20 May 2011 - Workshop Proceedings. pages 1567-1576, IEEE, 2011.
