Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization

Linqing Liu, Xiaolong Xu. Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization. Knowl.-Based Syst., 277:110784, October 2023.
