Transformer++: a long sequence modeling method based on direction-aware dual attention and multi-head sampling

Ruiqin Wang, Qishun Ji, Zhenzhen Sheng, Yang Qi. Transformer++: a long sequence modeling method based on direction-aware dual attention and multi-head sampling. Appl. Intell., 55(17):1103, November 2025.
