MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism

Zheng Zhang, Donglin Yang, Yaqi Xia, Liang Ding, Dacheng Tao, Xiaobo Zhou, Dazhao Cheng. MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism. In IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023, St. Petersburg, FL, USA, May 15-19, 2023. Pages 167-177, IEEE, 2023.
