Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs

Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra. Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. In 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, Rio de Janeiro, Brazil, May 20-24, 2019. Pages 111-122, IEEE, 2019.

Authors

Ahmad Abdelfattah

Stanimire Tomov

Jack J. Dongarra