APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes

Yuanxin Wei, Jiangsu Du, Jiazhi Jiang, Xiao Shi, XianWei Zhang, Dan Huang, Nong Xiao, Yutong Lu. APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2024, Atlanta, GA, USA, November 17-22, 2024. pages 90, IEEE, 2024.
