Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization

Linqing Liu, Xiaolong Xu 0002. Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization. Knowl.-Based Syst., 277:110784, October 2023.

Authors

Linqing Liu

Xiaolong Xu 0002
