- Resampling with Feedback - A New Paradigm of Using Workload Data for Performance Evaluation. Dror G. Feitelson. 3-21
- Scheduling DAGs Opportunistically: The Dream and the Reality Circa 2016. Arnold L. Rosenberg. 22-33
- Synchronization Debugging of Hybrid Parallel Programs. Olaf Krzikalla, Ralph Müller-Pfefferkorn, Wolfgang E. Nagel. 37-50
- Nasty-MPI: Debugging Synchronization Errors in MPI-3 One-Sided Applications. Roger Kowalewski, Karl Fürlinger. 51-62
- Automatic Benchmark Profiling Through Advanced Trace Analysis. Alexis Martin, Vania Marangozova-Martin. 63-74
- Addressing Materials Science Challenges Using GPU-accelerated POWER8 Nodes. Paul F. Baumeister, Marcel Bornemann, Markus Bühler, Thorsten Hater, Benjamin Krill, Dirk Pleiter, Rudolf Zeller. 77-89
- Performance Prediction and Ranking of SpMV Kernels on GPU Architectures. Christoph Lehnert, Rudolf Berrendorf, Jan P. Ecker, Florian Mannuss. 90-102
- The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8. Sandra Catalán, A. Cristiano I. Malossi, Costas Bekas, Enrique S. Quintana-Ortí. 103-116
- Power Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer. Alina Sîrbu, Özalp Babaoglu. 117-130
- Controlling and Assessing Correlations of Cost Matrices in Heterogeneous Scheduling. Louis-Claude Canon, Pierre-Cyrille Héam, Laurent Philippe. 133-145
- Penalized Graph Partitioning for Static and Dynamic Load Balancing. Tim Kiefer, Dirk Habich, Wolfgang Lehner. 146-158
- Non-preemptive Scheduling with Setup Times: A PTAS. Klaus Jansen, Felix Land. 159-170
- Cuboid Partitioning for Parallel Matrix Multiplication on Heterogeneous Platforms. Olivier Beaumont, Lionel Eyraud-Dubois, Thomas Lambert. 171-182
- HeSP: A Simulation Framework for Solving the Task Scheduling-Partitioning Problem on Heterogeneous Architectures. Anton Rey, Francisco D. Igual, Manuel Prieto-Matías. 183-195
- FPT Approximation Algorithm for Scheduling with Memory Constraints. Eric Angel, Cédric Chevalier, Franck Ledoux, Sébastien Morais, Damien Regnault. 196-208
- Scheduling MapReduce Jobs Under Multi-round Precedences. Dimitris Fotakis, Ioannis Milis, Orestis Papadigenopoulos, Vasilis Vassalos, Georgios Zois. 209-222
- Code Bones: Fast and Flexible Code Generation for Dynamic and Speculative Polyhedral Optimization. Juan Manuel Martinez Caamaño, Willy Wolff, Philippe Clauss. 225-237
- Piecewise Holistic Autotuning of Compiler and Runtime Parameters. Mihail Popov, Chadi Akel, William Jalby, Pablo de Oliveira Castro. 238-250
- Insights into the Fallback Path of Best-Effort Hardware Transactional Memory Systems. Ricardo Quislant, Eladio Gutiérrez, Emilio L. Zapata, Oscar G. Plata. 251-263
- Portable SIMD Performance with OpenMP* 4.x Compiler Directives. Florian Wende, Matthias Noack, Thomas Steinke, Michael Klemm, Chris J. Newburn, Georg Zitzlsberger. 264-277
- Lightweight Multi-language Bindings for Apache Spark. Luca Salucci, Daniele Bonetta, Walter Binder. 281-292
- Toward a General I/O Arbitration Framework for netCDF Based Big Data Processing. Jianwei Liao, Balazs Gerofi, Guo-Yuan Lien, Seiya Nishizawa, Takemasa Miyoshi, Hirofumi Tomita, Yutaka Ishikawa. 293-305
- High Performance Parallel Summed-Area Table Kernels for Multi-core and Many-core Systems. Angelos Papatriantafyllou, Dimitris Sacharidis. 306-318
- GraphIn: An Online High Performance Incremental Graph Processing Framework. Dipanjan Sengupta, Narayanan Sundaram, Xia Zhu, Theodore L. Willke, Jeffrey Young, Matthew Wolf, Karsten Schwan. 319-333
- Efficient Large Outer Joins over MapReduce. Long Cheng, Spyros Kotoulas. 334-346
- Slurm-V: Extending Slurm for Building Efficient HPC Cloud with SR-IOV and IVShmem. Jie Zhang, Xiaoyi Lu, Sourav Chakraborty, Dhabaleswar K. Panda. 349-362
- An Autonomic Parallel Strategy for the Projection of Ecological Niche Models in Heterogeneous Computational Environments. Fernanda G. O. Passos, Vinod E. F. Rebello. 363-375
- Towards Network-Aware Service Placement in Community Network Micro-Clouds. Mennan Selimi, Davide Vega, Felix Freitag, Luís Veiga. 376-388
- Heating as a Cloud-Service, A Position Paper (Industrial Presentation). Yanik Ngoko. 389-401
- Design and Verification of Distributed Phasers. Karthik Murthy, Sri Raj Paul, Kuldeep S. Meel, Tiago Cogumbreiro, John M. Mellor-Crummey. 405-418
- Exploring Partial Replication to Improve Lightweight Silent Data Corruption Detection for HPC Applications. Eduardo Berrocal, Leonardo Bautista-Gomez, Sheng Di, Zhiling Lan, Franck Cappello. 419-430
- Automatic Verification of Self-consistent MPI Performance Guidelines. Sascha Hunold, Alexandra Carpen-Amarie, Felix Donatus Lübbe, Jesper Larsson Träff. 433-446
- ParallelME: A Parallel Mobile Engine to Explore Heterogeneity in Mobile Computing Architectures. Guilherme Andrade, Wilson de Carvalho, Renato Utsch, Pedro Caldeira, Alberto Alburquerque, Fabricio Ferracioli, Leonardo C. da Rocha, Michael Frank, Dorgival O. Guedes, Renato Ferreira. 447-459
- CBPQ: High Performance Lock-Free Priority Queue. Anastasia Braginsky, Nachshon Cohen, Erez Petrank. 460-474
- Redesigning Triangular Dense Matrix Computations on GPUs. Ali Charara, Hatem Ltaief, David E. Keyes. 477-489
- A Sharing-Aware Memory Management Unit for Online Mapping in Multi-core Architectures. Eduardo H. M. Cruz, Matthias Diener, Laércio Lima Pilla, Philippe O. A. Navaux. 490-501
- GreenBST: Energy-Efficient Concurrent Search Tree. Ibrahim Umar, Otto J. Anshus, Phuong Hoai Ha. 502-517
- HAP: A Heterogeneity-Conscious Runtime System for Adaptive Pipeline Parallelism. Jinsu Park, Woongki Baek. 518-530
- Using Data Dependencies to Improve Task-Based Scheduling Strategies on NUMA Architectures. Philippe Virouleau, François Broquedis, Thierry Gautier, Fabrice Rastello. 531-544
- Multicore vs Manycore: The Energy Cost of Concurrency. Martin Groen, Vincent Gramoli. 545-557
- Work-Efficient Parallel Union-Find with Applications to Incremental Graph Connectivity. Natcha Simsiri, Kanat Tangwongsan, Srikanta Tirthapura, Kun-Lung Wu. 561-573
- An Efficient Cache-oblivious Parallel Viterbi Algorithm. Rezaul Chowdhury, Pramod Ganapathi, Vivek Pradhan, Jesmin Jahan Tithi, Yunpeng Xiao. 574-587
- Gradual Stabilization Under τ-Dynamics. Karine Altisen, Stéphane Devismes, Anaïs Durand, Franck Petit. 588-602
- High Performance Polar Decomposition on Distributed Memory Systems. Dalal Sukkari, Hatem Ltaief, David E. Keyes. 605-616
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves. Weifeng Liu, Ang Li, Jonathan Hogg, Iain S. Duff, Brian Vinter. 617-630
- Exploiting Task-Parallelism in Message-Passing Sparse Linear System Solvers Using OmpSs. José Ignacio Aliaga, Maria Barreda, Matthias Bollhöfer, Enrique S. Quintana-Ortí. 631-643
- Lightweight and Accurate Silent Data Corruption Detection in Ordinary Differential Equation Solvers. Pierre-Louis Guhur, Hong Zhang, Tom Peterka, Emil M. Constantinescu, Franck Cappello. 644-656
- High-Performance Matrix-Matrix Multiplications of Very Small Matrices. Ian Masliah, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Marc Baboulin, Joël Falcou, Jack J. Dongarra. 659-671
- Effective Minimally-Invasive GPU Acceleration of Distributed Sparse Matrix Factorization. Anshul Gupta, Natalia Gimelshein, Seid Koric, Steven C. Rennich. 672-683
- Automatic OpenCL Task Adaptation for Heterogeneous Architectures. Pierre Huchant, Marie Christine Counilh, Denis Barthou. 684-696