VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Steve Dai, Rangharajan Venkatesan, Mark Ren, Brian Zimmer, William J. Dally, Brucek Khailany. VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference. In Alex Smola, Alex Dimakis, Ion Stoica, editors, Proceedings of Machine Learning and Systems 2021, MLSys 2021, virtual, April 5-9, 2021. mlsys.org, 2021. [doi]

Abstract

Abstract is missing.