Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization

Linqing Liu, Xiaolong Xu. Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization. Knowl.-Based Syst., 277:110784, October 2023.

@article{LiuX23-11,
  title = {Self-attention Mechanism at the Token Level: Gradient Analysis and Algorithm Optimization},
  author = {Linqing Liu and Xiaolong Xu},
  year = {2023},
  month = oct,
  doi = {10.1016/j.knosys.2023.110784},
  url = {https://doi.org/10.1016/j.knosys.2023.110784},
  researchr = {https://researchr.org/publication/LiuX23-11},
  cites = {0},
  citedby = {0},
  journal = {Knowl.-Based Syst.},
  volume = {277},
  pages = {110784},
}