Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference

Benjamin Hawks, Javier M. Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu. Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference. Frontiers Artif. Intell., 4:676564, 2021.
