A tunable holistic resiliency approach for high-performance computing systems

Stephen L. Scott, Christian Engelmann, Geoffroy Vallée, Thomas Naughton, Anand Tikotekar, George Ostrouchov, Chokchai Leangsuksun, Nichamon Naksinehaboon, Raja Nassar, Mihaela Paun, Frank Mueller, Chao Wang, Arun Babu Nagarajan, Jyothish Varma. A tunable holistic resiliency approach for high-performance computing systems. In Daniel A. Reed, Vivek Sarkar, editors, Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2009, Raleigh, NC, USA, February 14-18, 2009. Pages 305-306, ACM, 2009.
