Kair: A Statistical and Causal Approach to Pinpointing Stragglers in Distributed Model Training

Yitang Yang, Junhong Liu, Jiapeng Chen, Xiaoyang Sun, Tianyu Wo, Chunming Hu, Chengru Song, Jin Ouyang, Renyu Yang. Kair: A Statistical and Causal Approach to Pinpointing Stragglers in Distributed Model Training. In 40th IEEE/ACM International Conference on Automated Software Engineering, ASE 2025, Seoul, Republic of Korea, November 16-20, 2025. Pages 3754-3759, IEEE, 2025.

Authors

Yitang Yang

Junhong Liu

Jiapeng Chen

Xiaoyang Sun

Tianyu Wo

Chunming Hu

Chengru Song

Jin Ouyang

Renyu Yang
