Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs

Simon Garcia De Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-mei Hwu. Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs. In Mahmut Taylan Kandemir, Alexandra Jimborean, Tipp Moseley, editors, IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2019, Washington, DC, USA, February 16-20, 2019. pages 73-84, IEEE, 2019. [doi]

Abstract

Abstract is missing.