Transformer++: a long sequence modeling method based on direction-aware dual attention and multi-head sampling

Ruiqin Wang, Qishun Ji, Zhenzhen Sheng, Yang Qi. Transformer++: a long sequence modeling method based on direction-aware dual attention and multi-head sampling. Appl. Intell., 55(17):1103, November 2025. doi:10.1007/s10489-025-06965-6

@article{WangJSQ25,
  title = {{Transformer++}: a long sequence modeling method based on direction-aware dual attention and multi-head sampling},
  author = {Ruiqin Wang and Qishun Ji and Zhenzhen Sheng and Yang Qi},
  year = {2025},
  month = {November},
  doi = {10.1007/s10489-025-06965-6},
  url = {https://doi.org/10.1007/s10489-025-06965-6},
  researchr = {https://researchr.org/publication/WangJSQ25},
  journal = {Appl. Intell.},
  volume = {55},
  number = {17},
  pages = {1103},
}