ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers

Zhewei Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He. ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022.

Authors

Zhewei Yao

Reza Yazdani Aminabadi

Minjia Zhang

Xiaoxia Wu

Conglong Li

Yuxiong He
