Hanxiao Zhang, Lin Ju, Chan Wu, Jinjing Huang, Youshao Xiao, Zhenglei Zhou, Zhiming Fan, Zhaoxin Huan, Siyuan Li, Fanzhuang Meng, Lei Liang, Xiaolu Zhang, Jun Zhou. Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, Cheng Zhang 0005, editors, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024. 2024. [doi]
Abstract is missing.