HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU

Shaoyi Huang, Shiyang Chen, Hongwu Peng, Daniel Manu, Zhenglun Kong, Geng Yuan, Lei Yang, Shusen Wang, Hang Liu, Caiwen Ding. HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU. In Yiran Chen, Victor V. Zhirnov, Avesta Sasan, Ioannis Savidis, editors, GLSVLSI '21: Great Lakes Symposium on VLSI 2021, Virtual Event, USA, June 22-25, 2021. Pages 169-174, ACM, 2021. doi:10.1145/3453688.3461740

@inproceedings{HuangCPMKYYW0D21,
  title = {HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU},
  author = {Shaoyi Huang and Shiyang Chen and Hongwu Peng and Daniel Manu and Zhenglun Kong and Geng Yuan and Lei Yang and Shusen Wang and Hang Liu and Caiwen Ding},
  year = {2021},
  doi = {10.1145/3453688.3461740},
  url = {https://doi.org/10.1145/3453688.3461740},
  researchr = {https://researchr.org/publication/HuangCPMKYYW0D21},
  cites = {0},
  citedby = {0},
  pages = {169--174},
  booktitle = {GLSVLSI '21: Great Lakes Symposium on VLSI 2021, Virtual Event, USA, June 22-25, 2021},
  editor = {Yiran Chen and Victor V. Zhirnov and Avesta Sasan and Ioannis Savidis},
  publisher = {ACM},
  isbn = {978-1-4503-8393-6},
}