Gradient-based Intra-attention Pruning on Pre-trained Language Models

Ziqing Yang, Yiming Cui, Xin Yao, Shijin Wang. Gradient-based Intra-attention Pruning on Pre-trained Language Models. In Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 2775-2790. Association for Computational Linguistics, 2023.

@inproceedings{0001CYW23,
  title = {Gradient-based Intra-attention Pruning on Pre-trained Language Models},
  author = {Ziqing Yang and Yiming Cui and Xin Yao and Shijin Wang},
  year = {2023},
  url = {https://aclanthology.org/2023.acl-long.156},
  researchr = {https://researchr.org/publication/0001CYW23},
  pages = {2775-2790},
  booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023},
  editor = {Anna Rogers and Jordan L. Boyd-Graber and Naoaki Okazaki},
  publisher = {Association for Computational Linguistics},
  isbn = {978-1-959429-72-2},
}
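
For context, the title refers to pruning structures inside the attention mechanism of a pre-trained Transformer using gradient information. The paper's own method (GRAIN) is not reproduced here; the sketch below illustrates the general family of techniques with a first-order, gradient-based head-importance score in the style of Michel et al. (2019), "Are Sixteen Heads Really Better than One?". The `model`, `dataloader`, and device handling are illustrative assumptions; the `head_mask` argument follows the Hugging Face Transformers convention, and batches are assumed to include labels so that `outputs.loss` is defined.

import torch

def head_importance(model, dataloader, num_layers, num_heads, device="cpu"):
    # Differentiable all-ones mask over attention heads; the gradient of
    # the loss with respect to each entry is a first-order estimate of
    # how much the loss would change if that head were removed.
    head_mask = torch.ones(num_layers, num_heads, device=device, requires_grad=True)
    importance = torch.zeros(num_layers, num_heads, device=device)

    model.to(device).eval()
    for batch in dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch, head_mask=head_mask)
        outputs.loss.backward()
        # Accumulate |dL/d(mask)| as the sensitivity of each head.
        importance += head_mask.grad.abs().detach()
        head_mask.grad = None  # reset mask gradient for the next batch
        model.zero_grad()      # discard parameter gradients; only scoring here

    # Lower accumulated scores mark heads that are cheaper to prune.
    return importance

Heads with the smallest accumulated scores would be the natural pruning candidates. Note that, per its title, the paper scores finer-grained structures inside each attention head rather than whole heads, so this whole-head sketch is only a coarse analogue.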