Communication-Efficient Sparsely-Activated Model Training via Sequence Migration and Token Condensation

Fahao Chen, Peng Li, Zicong Hong, Zhou Su, Song Guo. Communication-Efficient Sparsely-Activated Model Training via Sequence Migration and Token Condensation. IEEE/ACM Trans. Netw., 33(6):2869-2880, December 2025.
