An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs

Shixiong Xu, David Gregg. An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs. In 2015 International Conference on Parallel Architecture and Compilation, PACT 2015, San Francisco, CA, USA, October 18-21, 2015. pages 488-489, IEEE, 2015.
