Optimizing Mixture-of-Experts Inference Time via Model Deployment and Communication Scheduling

Jialong Li, Shreyansh Tripathi, Lakshay Rastogi, Yiming Lei, Rui Pan, Yiting Xia. Optimizing Mixture-of-Experts Inference Time via Model Deployment and Communication Scheduling. IEEE/ACM Trans. Netw., 34:2478-2497, 2026.

Abstract

Abstract is missing.