An Explainable Model for Fault Detection in HPC Systems

Martin Molan, Andrea Borghesi, Francesco Beneventi, Massimiliano Guarrasi, Andrea Bartolini. An Explainable Model for Fault Detection in HPC Systems. In Heike Jagode, Hartwig Anzt, Hatem Ltaief, Piotr Luszczek, editors, High Performance Computing - ISC High Performance Digital 2021 International Workshops, Frankfurt am Main, Germany, June 24 - July 2, 2021, Revised Selected Papers. Volume 12761 of Lecture Notes in Computer Science, pages 378-391, Springer, 2021.
