Top-KAST: Top-K Always Sparse Training

Siddhant M. Jayakumar, Razvan Pascanu, Jack W. Rae, Simon Osindero, Erich Elsen. Top-KAST: Top-K Always Sparse Training. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.

Authors

Siddhant M. Jayakumar

Razvan Pascanu

Jack W. Rae

Simon Osindero

Erich Elsen
