TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models

Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica. TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Volume 139 of Proceedings of Machine Learning Research, pages 6543-6552. PMLR, 2021.

Authors

Zhuohan Li

Siyuan Zhuang

Shiyuan Guo

Danyang Zhuo

Hao Zhang

Dawn Song

Ion Stoica
