Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems

Haryadi S. Gunawi, Riza O. Suminto, Russell Sears, Casey Golliher, Swaminathan Sundararaman, Xing Lin, Tim Emami, Weiguang Sheng, Nematollah Bidokhti, Caitie McCaffrey, Deepthi Srinivasan, Biswaranjan Panda, Andrew Baptist, Gary Grider, Parks M. Fields, Kevin Harms, Robert B. Ross, Andree Jacobson, Robert Ricci, Kirk Webb, Peter Alvaro, H. Birali Runesha, Mingzhe Hao, Huaicheng Li. Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems. TOS, 14(3), 2018.

@article{GunawiSSGSLESBM18-0,
  title = {Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems},
  author = {Haryadi S. Gunawi and Riza O. Suminto and Russell Sears and Casey Golliher and Swaminathan Sundararaman and Xing Lin and Tim Emami and Weiguang Sheng and Nematollah Bidokhti and Caitie McCaffrey and Deepthi Srinivasan and Biswaranjan Panda and Andrew Baptist and Gary Grider and Parks M. Fields and Kevin Harms and Robert B. Ross and Andree Jacobson and Robert Ricci and Kirk Webb and Peter Alvaro and H. Birali Runesha and Mingzhe Hao and Huaicheng Li},
  year = {2018},
  url = {https://dl.acm.org/citation.cfm?id=3242086},
  researchr = {https://researchr.org/publication/GunawiSSGSLESBM18-0},
  cites = {0},
  citedby = {0},
  journal = {ACM Transactions on Storage},
  volume = {14},
  number = {3},
}