Efficient Management and Intelligent Fault Tolerance for HPC Interconnect Networks

Jijun Cao, Mingche Lai, Zhang Luo, Jiaqing Xu, Zhengbin Pang. Efficient Management and Intelligent Fault Tolerance for HPC Interconnect Networks. In 25th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2019, Tianjin, China, December 4-6, 2019, pages 343-351. IEEE, 2019.

Authors

Jijun Cao

Mingche Lai

Zhang Luo

Jiaqing Xu

Zhengbin Pang
