The following publications are possible variants of this publication:
- Adaptive Attention for Sparse-based Long-sequence Transformer. Xuanyu Zhang, Zhepeng Lv, Qing Yang. ACL 2023: 8602-8610 [doi]
- Hybrid regularization for compressed sensing MRI: Exploiting shearlet transform and group-sparsity total variation. Ryan Wen Liu, Lin Shi, Simon C. H. Yu, Defeng Wang. FUSION 2017: 1-8 [doi]
- SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference. Wenxun Wang, Shuchang Zhou, Wenyu Sun, Peiqin Sun, Yongpan Liu. ICCAD 2023: 1-9 [doi]
- SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences. Guan Shen, Jieru Zhao, Quan Chen, Jingwen Leng, Chao Li, Minyi Guo. DAC 2022: 571-576 [doi]
- Sparse Transformer Hawkes Process for Long Event Sequences. Zhuoqun Li, Mingxuan Sun. PKDD 2023: 172-188 [doi]
- Bidformer: A Transformer-Based Model via Bidirectional Sparse Self-Attention Mechanism for Long Sequence Time-Series Forecasting. Wei Li, Xiangxu Meng, Chuhan Chen, Hailin Mi, Huiqiang Wang. SMC 2023: 4076-4082 [doi]
- Long-range Sequence Modeling with Predictable Sparse Attention. Yimeng Zhuang, Jing Zhang, Mei Tu. ACL 2022: 234-243 [doi]
- Sparse Mix-Attention Transformer for Multispectral Image and Hyperspectral Image Fusion. Shihai Yu, Xu Zhang, Huihui Song. Remote Sensing, 16(1):144, January 2024 [doi]