On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter A. Milder, H. Andrew Schwartz, Niranjan Balasubramanian. On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers. In Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli, editors, Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, pages 4147-4157. Association for Computational Linguistics, 2021.
