Xinwei Fu, Zhen Zhang, Haozheng Fan, Guangtai Huang, Mohammad El-Shabani, Randy Huang, Rahul Solanki, Fei Wu, Ron Diamant, Yida Wang. Distributed Training of Large Language Models on AWS Trainium. In Proceedings of the 2024 ACM Symposium on Cloud Computing, SoCC 2024, Redmond, WA, USA, November 20-22, 2024. Pages 961-976, ACM, 2024.