ULSeq-TA: Ultra-Long Sequence Attention Fusion Transformer Accelerator Supporting Grouped Sparse Softmax and Dual-Path Sparse LayerNorm

Jingyu Wang, Lu Zhang, Xueqing Li, Huazhong Yang, Yongpan Liu. ULSeq-TA: Ultra-Long Sequence Attention Fusion Transformer Accelerator Supporting Grouped Sparse Softmax and Dual-Path Sparse LayerNorm. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 43(3):892-905, March 2024.
