An optimal checkpoint/restart model for a large scale high performance computing system

Yudan Liu, Raja Nassar, Chokchai Leangsuksun, Nichamon Naksinehaboon, Mihaela Paun, Stephen L. Scott. An optimal checkpoint/restart model for a large scale high performance computing system. In 22nd IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2008, Miami, Florida, USA, April 14-18, 2008. Pages 1-9, IEEE, 2008.
