Optimization principles and application performance evaluation of a multithreaded GPU using CUDA

Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, Wen-mei W. Hwu. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In Siddhartha Chatterjee, Michael L. Scott, editors, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008. pages 73-82, ACM, 2008. [doi]