Shfl-BW: accelerating deep neural network inference with tensor-core aware weight pruning

Guyue Huang, Haoran Li, Minghai Qin, Fei Sun, Yufei Ding, Yuan Xie. Shfl-BW: accelerating deep neural network inference with tensor-core aware weight pruning. In Rob Oshana, editor, DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10-14, 2022, pages 1153-1158. ACM, 2022.

Abstract

Abstract is missing.