ZeRO: memory optimizations toward training trillion parameter models

Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He. ZeRO: memory optimizations toward training trillion parameter models. In Christine Cuicchi, Irene Qualters, William T. Kramer, editors, SC '20: The International Conference for High Performance Computing, Networking, Storage and Analysis, Virtual Event / Atlanta, Georgia, USA, November 9-19, 2020. pages 20, IEEE/ACM, 2020.
