Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs

Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra. Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. In 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, Rio de Janeiro, Brazil, May 20-24, 2019. pages 111-122, IEEE, 2019.
