VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Steve Dai, Rangharajan Venkatesan, Mark Ren, Brian Zimmer, William J. Dally, Brucek Khailany. VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference. In Alex Smola, Alex Dimakis, Ion Stoica, editors, Proceedings of Machine Learning and Systems 2021, MLSys 2021, virtual, April 5-9, 2021. mlsys.org, 2021.

Authors

Steve Dai

Rangharajan Venkatesan

Mark Ren

Brian Zimmer

William J. Dally

Brucek Khailany
