On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter A. Milder, H. Andrew Schwartz, Niranjan Balasubramanian. On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers. In Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli, editors, Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021. Pages 4147-4157, Association for Computational Linguistics, 2021.

Authors

Tianchu Ji

Shraddhan Jain

Michael Ferdman

Peter A. Milder

H. Andrew Schwartz

Niranjan Balasubramanian
