Failures in large scale systems: long-term measurement, analysis, and implications

Saurabh Gupta, Tirthak Patel, Christian Engelmann, Devesh Tiwari. Failures in large scale systems: long-term measurement, analysis, and implications. In Bernd Mohr, Padma Raghavan, editors, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO, USA, November 12-17, 2017. Page 44. ACM, 2017.
