Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference

Benjamin Hawks, Javier M. Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu. Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference. Frontiers Artif. Intell., 4:676564, 2021. doi:10.3389/frai.2021.676564

@article{HawksDFPTU21,
  title = {Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference},
  author = {Benjamin Hawks and Javier M. Duarte and Nicholas J. Fraser and Alessandro Pappalardo and Nhan Tran and Yaman Umuroglu},
  year = {2021},
  doi = {10.3389/frai.2021.676564},
  url = {https://doi.org/10.3389/frai.2021.676564},
  researchr = {https://researchr.org/publication/HawksDFPTU21},
  cites = {0},
  citedby = {0},
  journal = {Frontiers Artif. Intell.},
  volume = {4},
  pages = {676564},
}