A reliability-aware approach for an optimal checkpoint/restart model in HPC environments

Yudan Liu, Raja Nassar, Chokchai Leangsuksun, Nichamon Naksinehaboon, Mihaela Paun, Stephen L. Scott. A reliability-aware approach for an optimal checkpoint/restart model in HPC environments. In Proceedings of the 2007 IEEE International Conference on Cluster Computing, 17-20 September 2007, Austin, Texas, USA, pages 452-457. IEEE, 2007.

Authors

Yudan Liu

Raja Nassar

Chokchai Leangsuksun

Nichamon Naksinehaboon

Mihaela Paun

Stephen L. Scott
