- Multi-workgroup Tiling to Improve the Locality of Explicit One-Step Methods for ODE Systems with Limited Access Distance on GPUs. Matthias Korch, Tim Werner. 3-12
- Structure-Aware Calculation of Many-Electron Wave Function Overlaps on Multicore Processors. Davor Davidovic, Enrique S. Quintana-Ortí. 13-24
- Lazy Stencil Integration in Multigrid Algorithms. Charles D. Murray, Tobias Weinzierl. 25-37
- High Performance Tensor-Vector Multiplication on Shared-Memory Systems. Filip Pawlowski, Bora Uçar, Albert-Jan Yzelman. 38-48
- Efficient Modular Squaring in Binary Fields on CPU Supporting AVX and GPU. Pawel Augustynowicz, Andrzej Paszkiewicz. 49-57
- Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils. Carl Christian Kjelgaard Mikkelsen, Mirko Myllykoski. 58-69
- Introduction to StarNEig - A Task-Based Library for Solving Nonsymmetric Eigenvalue Problems. Mirko Myllykoski, Carl Christian Kjelgaard Mikkelsen. 70-81
- Robust Task-Parallel Solution of the Triangular Sylvester Equation. Angelika Beatrix Schwarz, Carl Christian Kjelgaard Mikkelsen. 82-92
- Vectorized Parallel Solver for Tridiagonal Toeplitz Systems of Linear Equations. Beata Dmitruk, Przemyslaw Stpiczynski. 93-103
- Parallel Performance of an Iterative Solver Based on the Golub-Kahan Bidiagonalization. Carola Kruse, Masha Sosonkina, Mario Arioli, Nicolas Tardieu, Ulrich Rüde. 104-116
- A High-Performance Implementation of a Robust Preconditioner for Heterogeneous Problems. Linus Seelinger, Anne Reinarz, Robert Scheichl. 117-128
- Hybrid Solver for Quasi Block Diagonal Linear Systems. Viviana Arrigoni, Annalisa Massini. 129-140
- Parallel Adaptive Cross Approximation for the Multi-trace Formulation of Scattering Problems. Michal Kravcenko, Jan Zapletal, Xavier Claeys, Michal Merta. 141-150
- Implementation of Parallel 3-D Real FFT with 2-D Decomposition on Intel Xeon Phi Clusters. Daisuke Takahashi. 151-161
- Exploiting Symmetries of Small Prime-Sized DFTs. Doru-Thom Popovici, Devangi N. Parikh, Daniele G. Spampinato, Tze Meng Low. 162-173
- Parallel Computations for Various Scalarization Schemes in Multicriteria Optimization Problems. Victor Gergel, Evgeny Kozinov. 174-184
- Early Performance Assessment of the ThunderX2 Processor for Lattice Based Simulations. Enrico Calore, Alessandro Gabbana, Fabio Rinaldi, Sebastiano Fabio Schifano, Raffaele Tripiccione. 187-198
- An Area Efficient and Reusable HEVC 1D-DCT Hardware Accelerator. Mate Cobrnic, Alen Duspara, Leon Dragic, Igor Piljic, Hrvoje Mlinaric, Mario Kovac. 199-208
- Improving Locality-Aware Scheduling with Acyclic Directed Graph Partitioning. M. Yusuf Özkaya, Anne Benoit, Ümit V. Çatalyürek. 211-223
- Isoefficiency Maps for Divisible Computations in Hierarchical Memory Systems. Maciej Drozdowski, Gaurav Singh, Jedrzej M. Marszalkowski. 224-234
- OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine. Tim Cramer, Manoel Römmer, Boris Kosmynin, Erich Focht, Matthias S. Müller. 237-249
- On the Road to DiPOSH: Adventures in High-Performance OpenSHMEM. Camille Coti, Allen D. Malony. 250-260
- Click-Fraud Detection for Online Advertising. Roman Wiatr, Vladyslav Lyutenko, Milosz Demczuk, Renata Slota, Jacek Kitowski. 261-271
- Parallel Graph Partitioning Optimization Under PEGASUS DA Application Global State Monitoring. Adam Smyk, Marek Tudruj, Lukasz Grochal. 272-286
- Cloud Infrastructure Automation for Scientific Workflows. Bartosz Balis, Michal Orzechowski, Krystian Pawlik, Maciej Pawlik, Maciej Malawski. 287-297
- Posit NPB: Assessing the Precision Improvement in HPC Scientific Applications. Steven Wei Der Chien, Ivy Bo Peng, Stefano Markidis. 301-310
- A High-Order Discontinuous Galerkin Solver with Dynamic Adaptive Mesh Refinement to Simulate Cloud Formation Processes. Lukas Krenz, Leonhard Rannabauer, Michael Bader. 311-323
- Performance and Portability of State-of-Art Molecular Dynamics Software on Modern GPUs. Evgeny Kuznetsov, Nikolay Kondratyuk, Mikhail Logunov, Vsevolod P. Nikolskiy, Vladimir V. Stegailov. 324-334
- Exploiting Parallelism on Shared Memory in the QED Particle-in-Cell Code PICADOR with Greedy Load Balancing. Iosif Meyerov, Alexander Panov, Sergei Bastrakov, Aleksei Bashinov, Evgeny Efimenko, Elena Panova, Igor Surmin, Valentin Volokitin, Arkady Gonoskov. 335-347
- Parallelized Construction of Extension Velocities for the Level-Set Method. Michael Quell, Paul Manstetten, Andreas Hössinger, Siegfried Selberherr, Josef Weinbub. 348-358
- Relative Expression Classification Tree. A Preliminary GPU-Based Implementation. Marcin Czajkowski, Krzysztof Jurczuk, Marek Kretowski. 359-369
- Performance Optimizations for Parallel Modeling of Solidification with Dynamic Intensity of Computation. Kamil Halbiniak, Lukasz Szustak, Adam Kulawik, Pawel Gepner. 370-381
- SIMD-node Transformations for Non-blocking Data Structures. Joel Fuentes, Wei-Yu Chen, Guei-Yuan Lueh, Arturo Garza, Isaac D. Scherson. 385-395
- Stained Glass Image Generation Using Voronoi Diagram and Its GPU Acceleration. Hironobu Kobayashi, Yasuaki Ito, Koji Nakano. 396-407
- Modifying Queries Strategy for Graph-Based Speculative Query Execution for RDBMS. Anna Sasak-Okon. 408-418
- Accelerating GPU-based Evolutionary Induction of Decision Trees - Fitness Evaluation Reuse. Krzysztof Jurczuk, Marcin Czajkowski, Marek Kretowski. 421-431
- A Distributed Modular Scalable and Generic Framework for Parallelizing Population-Based Metaheuristics. Hatem Khalloof, Phil Ostheimer, Wilfried Jakob, Shadi Shahoud, Clemens Düpmeier, Veit Hagenmeyer. 432-444
- Parallel Processing of Images Represented by Linguistic Description in Databases. Danuta Rutkowska, Krzysztof Wiaderek. 445-456
- An OpenMP Parallelization of the K-means Algorithm Accelerated Using KD-trees. Wojciech Kwedlo, Michal Lubowicz. 457-466
- Evaluating the Use of Policy Gradient Optimization Approach for Automatic Cloud Resource Provisioning. Wlodzimierz Funika, Pawel Koperek. 467-478
- Improving Efficiency of Automatic Labeling by Image Transformations on CPU and GPU. Lukasz Karbowiak. 479-490
- Efficient Triangular Matrix Vector Multiplication on the GPU. Takahiro Inoue, Hiroki Tokura, Koji Nakano, Yasuaki Ito. 493-504
- Performance Engineering for a Tall & Skinny Matrix Multiplication Kernels on GPUs. Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein. 505-515
- Reproducible BLAS Routines with Tunable Accuracy Using Ozaki Scheme for Many-Core Architectures. Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki. 516-527
- Portable Monte Carlo Transport Performance Evaluation in the PATMOS Prototype. Tao Chang, Emeric Brun, Christophe Calvin. 528-539
- Multifrontal Non-negative Matrix Factorization. Piyush Sao, Ramakrishnan Kannan. 543-554
- Preconditioned Jacobi SVD Algorithm Outperforms PDGESVD. Martin Becka, Gabriel Oksa. 555-566
- A Parallel Factorization for Generating Orthogonal Matrices. Marek Parfieniuk. 567-578