Scalable Distributed Training of Recommendation Models: An ASTRA-SIM + NS3 case-study with TCP/IP transport

Saeed Rashidi, Pallavi Shurpali, Srinivas Sridharan, Naader Hassani, Dheevatsa Mudigere, Krishnakumar Nair, Misha Smelyanski, Tushar Krishna. Scalable Distributed Training of Recommendation Models: An ASTRA-SIM + NS3 case-study with TCP/IP transport. In IEEE Symposium on High-Performance Interconnects, HOTI 2020, Piscataway, NJ, USA, August 19-21, 2020, pages 33-42. IEEE, 2020. doi:10.1109/HOTI51249.2020.00020

@inproceedings{RashidiS0HMNSK20,
  title = {Scalable Distributed Training of Recommendation Models: An ASTRA-SIM + NS3 case-study with TCP/IP transport},
  author = {Saeed Rashidi and Pallavi Shurpali and Srinivas Sridharan and Naader Hassani and Dheevatsa Mudigere and Krishnakumar Nair and Misha Smelyanski and Tushar Krishna},
  year = {2020},
  doi = {10.1109/HOTI51249.2020.00020},
  url = {https://doi.org/10.1109/HOTI51249.2020.00020},
  researchr = {https://researchr.org/publication/RashidiS0HMNSK20},
  cites = {0},
  citedby = {0},
  pages = {33--42},
  booktitle = {IEEE Symposium on High-Performance Interconnects, HOTI 2020, Piscataway, NJ, USA, August 19-21, 2020},
  publisher = {IEEE},
  isbn = {978-1-7281-9589-6},
}