EReinit: Scalable and efficient fault-tolerance for bulk-synchronous MPI applications

Sourav Chakraborty, Ignacio Laguna, Murali Emani, Kathryn Mohror, Dhabaleswar K. Panda, Martin Schulz, Hari Subramoni. EReinit: Scalable and efficient fault-tolerance for bulk-synchronous MPI applications. Concurrency - Practice and Experience, 32(3), 2020.

Authors

Sourav Chakraborty

Ignacio Laguna

Murali Emani

Kathryn Mohror

Dhabaleswar K. Panda

Martin Schulz

Hari Subramoni
