DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification

Daegun Yoon, Sangyoon Oh. DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification. In Proceedings of the 52nd International Conference on Parallel Processing, ICPP 2023, Salt Lake City, UT, USA, August 7-10, 2023, pages 746-755. ACM, 2023. doi:10.1145/3605573.3605609

@inproceedings{Yoon023,
  title = {DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification},
  author = {Daegun Yoon and Sangyoon Oh},
  year = {2023},
  doi = {10.1145/3605573.3605609},
  url = {https://doi.org/10.1145/3605573.3605609},
  researchr = {https://researchr.org/publication/Yoon023},
  cites = {0},
  citedby = {0},
  pages = {746--755},
  booktitle = {Proceedings of the 52nd International Conference on Parallel Processing, ICPP 2023, Salt Lake City, UT, USA, August 7-10, 2023},
  publisher = {ACM},
}