Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

Noam Shazeer, Mitchell Stern. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost. In Jennifer G. Dy, Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018. Volume 80 of JMLR Workshop and Conference Proceedings, pages 4603-4611, JMLR.org, 2018.

Authors

Noam Shazeer

Mitchell Stern