- A comparative investigation of device-specific mechanisms for exploiting HPC accelerators. Ayman Tarakji, Lukas Börger, Rainer Leupers. 1-12
- GPU-SM: shared memory multi-GPU programming. Javier Cabezas, Marc Jordà, Isaac Gelado, Nacho Navarro, Wen-mei W. Hwu. 13-24
- Adaptive GPU cache bypassing. Yingying Tian, Sooraj Puthoor, Joseph L. Greathouse, Bradford M. Beckmann, Daniel A. Jiménez. 25-35
- Efficient utilization of GPGPU cache hierarchy. Mahmoud Khairy, Mohamed Zahran, Amr G. Wassal. 36-47
- Effects of source-code optimizations on GPU performance and energy consumption. Jared Coplin, Martin Burtscher. 48-58
- Optimization for performance and energy for batched matrix computations on GPUs. Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra. 59-69
- Helium: a transparent inter-kernel optimizer for OpenCL. Thibaut Lutz, Christian Fensch, Murray Cole. 70-80
- Stochastic gradient descent on GPUs. Rashid Kaleem, Sreepathi Pai, Keshav Pingali. 81-89
- High performance computing of fiber scattering simulation. Leiming Yu, Yan Zhang, Xiang Gong, Nilay Roy, Lee Makowski, David R. Kaeli. 90-98
- Rethinking the parallelization of random-restart hill climbing: a case study in optimizing a 2-opt TSP solver for GPU execution. Molly A. O'Neil, Martin Burtscher. 99-108
- Forma: a DSL for image processing applications to target GPUs and multi-core CPUs. Mahesh Ravishankar, Justin Holewinski, Vinod Grover. 109-120