A Principled Approach to HPC Event Monitoring

Alireza Goudarzi, Dorian C. Arnold, Darko Stefanovic, Kurt B. Ferreira, Guy Feldman. A Principled Approach to HPC Event Monitoring. In Nathan DeBardeleben, Franck Cappello, Robert L. Clay, editors, Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, FTXS 2015, Portland, Oregon, USA, June 15, 2015. Pages 3-10, ACM, 2015.