Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices

Enrico Reggiani, Alessandro Pappalardo, Max Doblas, Miquel Moretó, Mauro Olivieri, Osman Sabri Unsal, Adrián Cristal. Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices. In IEEE International Symposium on High-Performance Computer Architecture, HPCA 2023, Montreal, QC, Canada, February 25 - March 1, 2023. Pages 1085-1098, IEEE, 2023. doi:10.1109/HPCA56546.2023.10071076

@inproceedings{ReggianiPDMOUC23,
  title = {Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices},
  author = {Enrico Reggiani and Alessandro Pappalardo and Max Doblas and Miquel Moretó and Mauro Olivieri and Osman Sabri Unsal and Adrián Cristal},
  year = {2023},
  doi = {10.1109/HPCA56546.2023.10071076},
  url = {https://doi.org/10.1109/HPCA56546.2023.10071076},
  researchr = {https://researchr.org/publication/ReggianiPDMOUC23},
  cites = {0},
  citedby = {0},
  pages = {1085--1098},
  booktitle = {IEEE International Symposium on High-Performance Computer Architecture, HPCA 2023, Montreal, QC, Canada, February 25 - March 1, 2023},
  publisher = {IEEE},
  isbn = {978-1-6654-7652-2},
}