Systemic Assessment of Node Failures in HPC Production Platforms

Anwesha Das, Frank Mueller, Barry Rountree. Systemic Assessment of Node Failures in HPC Production Platforms. In 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021, Portland, OR, USA, May 17-21, 2021, pages 267-276. IEEE, 2021. doi:10.1109/IPDPS49936.2021.00035

@inproceedings{Das0R21,
  title = {Systemic Assessment of Node Failures in HPC Production Platforms},
  author = {Anwesha Das and Frank Mueller and Barry Rountree},
  year = {2021},
  doi = {10.1109/IPDPS49936.2021.00035},
  url = {https://doi.org/10.1109/IPDPS49936.2021.00035},
  researchr = {https://researchr.org/publication/Das0R21},
  cites = {0},
  citedby = {0},
  pages = {267--276},
  booktitle = {35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021, Portland, OR, USA, May 17-21, 2021},
  publisher = {IEEE},
  isbn = {978-1-6654-4066-0},
}