An Allreduce Algorithm and Network Co-design for Large-Scale Training of Distributed Deep Learning

Truong Thao Nguyen, Mohamed Wahib. An Allreduce Algorithm and Network Co-design for Large-Scale Training of Distributed Deep Learning. In Laurent Lefèvre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel Nadjaran Toosi, Rajkumar Buyya, editors, 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021, Melbourne, Australia, May 10-13, 2021, pages 396-405. IEEE, 2021.
