Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference

Yuanxin Wei, Shengyuan Ye, Jiazhi Jiang, Xu Chen, Dan Huang, Jiangsu Du, Yutong Lu. Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference. In Design, Automation & Test in Europe Conference & Exhibition, DATE 2024, Valencia, Spain, March 25-27, 2024. pages 1-6, IEEE, 2024.