Supporting task-level fault-tolerance in HPC workflows by launching MPI jobs inside MPI jobs

Matthieu Dorier, Justin M. Wozniak, Robert B. Ross. Supporting task-level fault-tolerance in HPC workflows by launching MPI jobs inside MPI jobs. In Johan Montagnat, Ian Taylor, Sandra Gesing, Rizos Sakellariou, editors, Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science, WORKS@SC 2017, Denver, CO, USA, November 12-17, 2017. ACM, 2017.

Authors

Matthieu Dorier

Justin M. Wozniak

Robert B. Ross
