Time Machine: Generative Real-Time Model for Failure (and Lead Time) Prediction in HPC Systems

Khalid Ayedh Alharthi, Arshad Jhumka, Sheng Di, Lin Gui, Franck Cappello, Simon McIntosh-Smith. Time Machine: Generative Real-Time Model for Failure (and Lead Time) Prediction in HPC Systems. In 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2023, Porto, Portugal, June 27-30, 2023. Pages 508-521, IEEE, 2023. doi:10.1109/DSN58367.2023.00054

@inproceedings{AlharthiJDGCM23,
  title = {Time Machine: Generative Real-Time Model for Failure (and Lead Time) Prediction in HPC Systems},
  author = {Khalid Ayedh Alharthi and Arshad Jhumka and Sheng Di and Lin Gui and Franck Cappello and Simon McIntosh-Smith},
  year = {2023},
  doi = {10.1109/DSN58367.2023.00054},
  url = {https://doi.org/10.1109/DSN58367.2023.00054},
  researchr = {https://researchr.org/publication/AlharthiJDGCM23},
  cites = {0},
  citedby = {0},
  pages = {508--521},
  booktitle = {53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2023, Porto, Portugal, June 27-30, 2023},
  publisher = {IEEE},
  isbn = {979-8-3503-4793-7},
}