On the Use of Cluster-Based Partial Message Logging to Improve Fault Tolerance for MPI HPC Applications

Thomas Ropars, Amina Guermouche, Bora Uçar, Esteban Meneses, Laxmikant V. Kalé, Franck Cappello. On the Use of Cluster-Based Partial Message Logging to Improve Fault Tolerance for MPI HPC Applications. In Emmanuel Jeannot, Raymond Namyst, Jean Roman, editors, Euro-Par 2011 Parallel Processing - 17th International Conference, Euro-Par 2011, Bordeaux, France, August 29 - September 2, 2011, Proceedings, Part I. Volume 6852 of Lecture Notes in Computer Science, pages 567-578, Springer, 2011.
