Improving GPGPU Performance via Cache Locality Aware Thread Block Scheduling

Li-Jhan Chen, Hsiang-Yun Cheng, Po-Han Wang, Chia-Lin Yang. Improving GPGPU Performance via Cache Locality Aware Thread Block Scheduling. Computer Architecture Letters, 16(2):127-131, 2017.