Implementation and Evaluation of a Scalable Application-Level Checkpoint-Recovery Scheme for MPI Programs

Martin Schulz, Greg Bronevetsky, Rohit Fernandes, Daniel Marques, Keshav Pingali, Paul Stodghill. Implementation and Evaluation of a Scalable Application-Level Checkpoint-Recovery Scheme for MPI Programs. In Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 6-12 November 2004, Pittsburgh, PA, USA, CD-ROM, page 38. IEEE Computer Society, 2004.

@inproceedings{SchulzBFMPS04,
  title = {Implementation and Evaluation of a Scalable Application-Level Checkpoint-Recovery Scheme for {MPI} Programs},
  author = {Martin Schulz and Greg Bronevetsky and Rohit Fernandes and Daniel Marques and Keshav Pingali and Paul Stodghill},
  year = {2004},
  doi = {10.1145/1048933.1049982},
  url = {http://doi.acm.org/10.1145/1048933.1049982},
  researchr = {https://researchr.org/publication/SchulzBFMPS04},
  pages = {38},
  booktitle = {Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 6-12 November 2004, Pittsburgh, PA, USA, CD-Rom},
  publisher = {IEEE Computer Society},
  isbn = {0-7695-2153-3},
}