SGD with Large Step Sizes Learns Sparse Features

Maksym Andriushchenko, Aditya Vardhan Varre, Loucas Pillaud-Vivien, Nicolas Flammarion. SGD with Large Step Sizes Learns Sparse Features. In Andreas Krause, Emma Brunskill, KyungHyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett, editors, International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA. Volume 202 of Proceedings of Machine Learning Research, pages 903-925, PMLR, 2023.

@inproceedings{AndriushchenkoV23,
  title = {SGD with Large Step Sizes Learns Sparse Features},
  author = {Maksym Andriushchenko and Aditya Vardhan Varre and Loucas Pillaud-Vivien and Nicolas Flammarion},
  year = {2023},
  url = {https://proceedings.mlr.press/v202/andriushchenko23b.html},
  researchr = {https://researchr.org/publication/AndriushchenkoV23},
  pages = {903--925},
  booktitle = {International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA},
  editor = {Andreas Krause and Emma Brunskill and KyungHyun Cho and Barbara Engelhardt and Sivan Sabato and Jonathan Scarlett},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}