Using Probabilistic Characterization to Reduce Runtime Faults in HPC Systems

Jim M. Brandt, Bert J. Debusschere, Ann C. Gentile, Jackson Mayo, Philippe P. Pébay, David Thompson, Matthew Wong. Using Probabilistic Characterization to Reduce Runtime Faults in HPC Systems. In 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 19-22 May 2008, Lyon, France, pages 759-764. IEEE Computer Society, 2008.

Authors

Jim M. Brandt

Bert J. Debusschere

Ann C. Gentile

Jackson Mayo

Philippe P. Pébay

David Thompson

Matthew Wong
