Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference

Benjamin Hawks, Javier M. Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu. Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference. Frontiers Artif. Intell., 4:676564, 2021.
