Cooperative checkpointing: a robust approach to large-scale systems reliability

Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo. Cooperative checkpointing: a robust approach to large-scale systems reliability. In Gregory K. Egan, Yoichi Muraoka, editors, Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006, Cairns, Queensland, Australia, June 28 - July 1, 2006, pages 14-23. ACM, 2006.
