Time Machine: Generative Real-Time Model for Failure (and Lead Time) Prediction in HPC Systems

Khalid Ayedh Alharthi, Arshad Jhumka, Sheng Di, Lin Gui, Franck Cappello, Simon McIntosh-Smith. Time Machine: Generative Real-Time Model for Failure (and Lead Time) Prediction in HPC Systems. In 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2023, Porto, Portugal, June 27-30, 2023. Pages 508-521, IEEE, 2023.

Authors

Khalid Ayedh Alharthi

Arshad Jhumka

Sheng Di

Lin Gui

Franck Cappello

Simon McIntosh-Smith
