Researchr is a web site for finding, collecting, sharing, and reviewing scientific publications, for researchers by researchers.
Zhong Zhang 0004, Nian Shao, Chongming Gao, Rui Miao, Qinli Yang, Junming Shao. Mixhead: Breaking the low-rank bottleneck in multi-head attention language models. Knowl.-Based Syst., 240:108075, 2022. [doi]
Possibly Related Publications
The following publications are possibly variants of this publication:
Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen. Breaking the Softmax Bottleneck: A High-Rank RNN Language Model. ICLR 2018. [doi]