The following publications are possibly variants of this publication:
- Design and Implementation of a Highly Efficient DGEMM for 64-Bit ARMv8 Multi-core Processors. Feng Wang, Hao Jiang, Ke Zuo, Xing Su, Jingling Xue, Canqun Yang. icpp 2015: 200-209
- Stencil Computations on HPC-oriented ARMv8 64-Bit Multi-Core Processor. Chunjiang Li, Yushan Dong, Kuan Li. ica3pp 2015: 30-43
- Towards Highly Efficient DGEMM on the Emerging SW26010 Many-Core Processor. Lijuan Jiang, Chao Yang, Yulong Ao, Wanwang Yin, Wenjing Ma, Qiao Sun, Fangfang Liu, Rongfen Lin, Peng Zhang. icpp 2017: 422-431
- Efficient NTTRU Implementation on ARMv8. Zhuo Zhang, Jieyu Zheng, Yunlei Zhao. icpads 2023: 2793-2794