MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei. MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers. In Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli, editors, Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online, August 1-6, 2021, pages 2140-2151. Association for Computational Linguistics, 2021.

Authors

Wenhui Wang

Hangbo Bao

Shaohan Huang

Li Dong

Furu Wei
