Journal: Parallel Computing

Volume 38, Issue 9

465 -- 484 Basilio B. Fraguela, Ganesh Bikshandi, Jia Guo, María Jesús Garzarán, David A. Padua, Christoph von Praun. Optimization techniques for efficient HTA programs
485 -- 500 Takeshi Iwashita, Yu Hirotani, Takeshi Mifune, Toshio Murayama, Hideki Ohtani. Large-scale time-harmonic electromagnetic field analysis using a multigrid solver on a distributed memory parallel computer
501 -- 517 Amit Amritkar, Danesh Tafti, Rui Liu, Rick Kufrin, Barbara M. Chapman. OpenMP parallelism for fluid and fluid-particulate systems
518 -- 532 Wlodzimierz Bielecki, Marek Palkowski, Tomasz Klimek. Free scheduling for statement instances of parameterized arbitrarily nested affine loops

Volume 38, Issue 8

343 Volodymyr V. Kindratenko, Gregory D. Peterson. Application accelerators in HPC - Editorial introduction
344 -- 364 Andrew G. Schmidt, Siddhartha Datta, Ashwin A. Mendon, Ron Sass. Investigation into scaling I/O bound streaming applications productively with an all-FPGA cluster
365 -- 390 Frederico Pratas, Pedro Trancoso, Leonel Sousa, Alexandros Stamatakis, Guochun Shi, Volodymyr V. Kindratenko. Fine-grain parallelism using multi-core, Cell/BE, and GPU Systems
391 -- 407 Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory D. Peterson, Jack Dongarra. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming
408 -- 420 Francisco Vázquez, José-Jesús Fernández, Ester M. Garzón. Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach
421 -- 437 Depeng Yang, Gregory D. Peterson, Husheng Li. Compressed sensing and Cholesky decomposition on FPGAs and GPUs
438 -- 464 John Robert Wernsing, Greg Stitt. Elastic computing: A portable optimization framework for hybrid computers

Volume 38, Issue 6-7

277 -- 288 Francisco Argüello, Dora Blanco Heras, Montserrat Bóo, Julián Lamas-Rodriguez. The split-and-merge method in general purpose computation on GPUs
289 -- 309 Timothy D. R. Hartley, Erik Saule, Ümit V. Çatalyürek. Improving performance of adaptive component-based dataflow middleware
310 -- 328 Peng Di, Hui Wu, Jingling Xue, Feng Wang, Canqun Yang. Parallelizing SOR for GPGPUs using alternate loop tiling
329 -- 341 Rahul Nagpal, Anasua Bhowmik. Criticality guided energy aware speculation for speculative multithreaded processors

Volume 38, Issue 4-5

175 -- 193 Minhaj Ahmad Khan. Scheduling for heterogeneous Systems using constrained critical paths
194 -- 225 Kathryn Mohror, Karen L. Karavanic. Trace profiling: Scalable event tracing on high-end parallel systems
226 -- 244 Gerassimos D. Barlas. Cluster-based optimized parallel video transcoding
245 -- 259 Hasan Metin Aktulga, Joseph C. Fogarty, S. A. Pandit, Ananth Grama. Parallel reactive molecular dynamics: Numerical methods and algorithmic techniques
260 -- 276 Roman Wyrzykowski, Krzysztof Rojek, Lukasz Szustak. Model-driven adaptation of double-precision matrix multiplication to the Cell processor architecture

Volume 38, Issue 3

91 -- 110 Lucas Mello Schnorr, Guillaume Huard, Philippe Olivier Alexandre Navaux. A hierarchical aggregation model to achieve visualization scalability in the analysis of parallel applications
111 -- 124 Holger Scherl, Markus Kowarschik, Hannes G. Hofmann, Benjamin Keck, Joachim Hornegger. Evaluation of state-of-the-art hardware architectures for fast cone-beam CT reconstruction
125 -- 139 Andreu Moreno, Eduardo César, Andreu Guevara, Joan Sorribes, Tomàs Margalef. Load balancing in homogeneous pipeline based applications
140 -- 156 Aleksandr Ovcharenko, Daniel Ibanez, Fabien Delalondre, Onkar Sahni, Kenneth E. Jansen, Christopher D. Carothers, Mark S. Shephard. Neighborhood communication paradigm to increase scalability in large-scale dynamic scientific applications
157 -- 174 Andreas Klöckner, Nicolas Pinto, Yunsup Lee, Bryan C. Catanzaro, Paul Ivanov, Ahmed Fasih. PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation

Volume 38, Issue 12

595 -- 614 Madan Sathe, Olaf Schenk, Helmar Burkhart. An auction-based weighted matching implementation on massively parallel architectures
615 -- 630 Maja Etinski, Julita Corbalán, Jesús Labarta, Mateo Valero. Parallel job scheduling for power constrained HPC systems

Volume 38, Issue 10-11

533 -- 551 Yong Chen, Huaiyu Zhu, Hui Jin, Xian-He Sun. Algorithm-level Feedback-controlled Adaptive data prefetcher: Accelerating data access for high-performance processors
552 -- 575 Mickeal Verschoor, Andrei C. Jalba. Analysis and performance estimation of the Conjugate Gradient method on multiple GPUs
576 -- 594 Ümit V. Çatalyürek, John Feo, Assefaw Hadish Gebremedhin, Mahantesh Halappanavar, Alex Pothen. Graph coloring algorithms for multi-core and massively multithreaded architectures

Volume 38, Issue 1-2

1 Torsten Hoefler. Extensions for next-generation parallel programming models
2 -- 14 Nick Rutar, Jeffrey K. Hollingsworth. Data centric techniques for mapping performance data to program variables
15 -- 25 Joshua Hursey, Richard L. Graham. Analyzing fault aware collective performance in a process fault tolerant MPI
26 -- 36 Jesper Larsson Träff. Alternative, uniformly expressive and more scalable interfaces for collective communication in MPI
37 -- 51 George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Hérault, Pierre Lemarinier, Jack Dongarra. DAGuE: A generic distributed DAG engine for High Performance Computing
52 -- 65 Martin Sandrieser, Siegfried Benkner, Sabri Pllana. Using explicit platform descriptions to support programming of heterogeneous many-core systems
66 -- 74 Phil Miller, Aaron Becker, Laxmikant V. Kalé. Using shared arrays in message-driven parallel programs
75 -- 89 Pieter Hijma, Rob van Nieuwpoort, Ceriel J. H. Jacobs, Henri E. Bal. Generating synchronization statements in divide-and-conquer programs