Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling

Kyuhong Shim, Iksoo Choi, Wonyong Sung, Jungwook Choi. Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling. In 18th International SoC Design Conference, ISOCC 2021, Jeju Island, Republic of Korea, October 6-9, 2021, pages 357-358. IEEE, 2021.

Abstract

Abstract is missing.