Failure recovery for bulk synchronous applications with MPI stages

Nawrin Sultana, Martin Rüfenacht, Anthony Skjellum, Ignacio Laguna, Kathryn Mohror. Failure recovery for bulk synchronous applications with MPI stages. Parallel Computing, 84:1-14, 2019.

Authors

Nawrin Sultana

Martin Rüfenacht

Anthony Skjellum

Ignacio Laguna

Kathryn Mohror