Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems

Esteban Meneses, Greg Bronevetsky, Laxmikant V. Kalé. Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems. In 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16-20 May 2011 - Workshop Proceedings. pages 1533-1540, IEEE, 2011.
