Checkpointing Orchestration: Toward a Scalable HPC Fault-Tolerant Environment

Hui Jin, Tao Ke, Yong Chen, Xian-He Sun. Checkpointing Orchestration: Toward a Scalable HPC Fault-Tolerant Environment. In 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2012, Ottawa, Canada, May 13-16, 2012, pages 276-283. IEEE, 2012.

Authors

Hui Jin

Tao Ke

Yong Chen

Xian-He Sun
