HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU

Shaoyi Huang, Shiyang Chen, Hongwu Peng, Daniel Manu, Zhenglun Kong, Geng Yuan, Lei Yang, Shusen Wang, Hang Liu, Caiwen Ding. HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU. In Yiran Chen, Victor V. Zhirnov, Avesta Sasan, Ioannis Savidis, editors, GLSVLSI '21: Great Lakes Symposium on VLSI 2021, Virtual Event, USA, June 22-25, 2021. Pages 169-174, ACM, 2021.

Authors

Shaoyi Huang

Shiyang Chen

Hongwu Peng

Daniel Manu

Zhenglun Kong

Geng Yuan

Lei Yang

Shusen Wang

Hang Liu

Caiwen Ding
