The following publications are possibly variants of this publication:
- LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU. Tingxing Dong, Azzam Haidar, Piotr Luszczek, James Austin Harris, Stanimire Tomov, Jack Dongarra. hpcc 2014: 157-160 [doi]
- GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure. Ahmad Abdelfattah, Stanimire Tomov, Piotr Luszczek, Hartwig Anzt, Jack J. Dongarra. sc 2023: 1670-1679 [doi]
- Batch QR Factorization on GPUs: Design, Optimization, and Tuning. Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra. iccs 2022: 60-74 [doi]
- Multi-GPU Implementation of LU Factorization. Yulu Jia, Piotr Luszczek, Jack Dongarra. iccs 2012: 106-115 [doi]
- A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations. Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Stanimire Tomov, Jack J. Dongarra. tpds, 29(5): 973-984, 2018 [doi]
- Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization. Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra. hpec 2018: 1-7 [doi]