Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices

Enrico Reggiani, Alessandro Pappalardo, Max Doblas, Miquel Moretó, Mauro Olivieri, Osman Sabri Unsal, Adrián Cristal. Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices. In IEEE International Symposium on High-Performance Computer Architecture, HPCA 2023, Montreal, QC, Canada, February 25 - March 1, 2023. pages 1085-1098, IEEE, 2023. [doi]

Abstract

Abstract is missing.