- Skipping Non-essential Instructions Makes Data-Dependence Profiling Faster. Nicolas Morew, Mohammad Norouzi Arab, Ali Jannesari, Felix Wolf 0001. 3-17
- A Toolchain to Verify the Parallelization of OmpSs-2 Applications. Simone Economo, Sara Royuela, Eduard Ayguadé, Vicenç Beltran 0001. 18-33
- Towards a Model to Estimate the Reliability of Large-Scale Hybrid Supercomputers. Elvis Rojas, Esteban Meneses, Terry Jones, Don Maxwell. 37-51
- A Learning-Based Approach for Evaluating the Capacity of Data Processing Pipelines. Maha Alsayasneh, Noel De Palma. 52-67
- Operation-Aware Power Capping. Bo Wang, Julian Miller, Christian Terboven, Matthias S. Müller. 68-82
- A Comparison of the Scalability of OpenMP Implementations. Tim Jammer, Christian Iwainsky, Christian H. Bischof. 83-97
- Evaluating the Effectiveness of a Vector-Length-Agnostic Instruction Set. Andrei Poenaru, Simon McIntosh-Smith. 98-114
- Parallel Scheduling of Data-Intensive Tasks. Xiao Meng, Lukasz Golab. 117-133
- A Makespan Lower Bound for the Tiled Cholesky Factorization Based on ALAP Schedule. Olivier Beaumont, Julien Langou, Willy Quach, Alena Shilova. 134-150
- Optimal GPU-CPU Offloading Strategies for Deep Neural Network Training. Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova. 151-166
- Improving Mapping for Sparse Direct Solvers - A Trade-Off Between Data Locality and Load Balancing. Changjiang Gou, Ali Al Zoobi, Anne Benoit, Mathieu Faverge, Loris Marchal, Grégoire Pichon, Pierre Ramet. 167-182
- Modelling Standard and Randomized Slimmed Folded Clos Networks. Cristóbal Camarero, Javier Corral, Carmen Martínez, Ramón Beivide. 185-199
- OmpMemOpt: Optimized Memory Movement for Heterogeneous Computing. Prithayan Barua, Jisheng Zhao, Vivek Sarkar. 200-216
- Accelerating Deep Learning Inference with Cross-Layer Data Reuse on GPUs. Xueying Wang, Guangli Li, Xiao Dong, Jiansong Li, Lei Liu 0030, Xiaobing Feng 0002. 219-233
- Distributed Fine-Grained Traffic Speed Prediction for Large-Scale Transportation Networks Based on Automatic LSTM Customization and Sharing. Ming-Chang Lee, Jia-Chun Lin, Ernst Gunnar Gran. 234-247
- Optimizing FFT-Based Convolution on ARMv8 Multi-core CPUs. Qinglin Wang, Dongsheng Li, Xiandong Huang, Siqi Shen, Songzhu Mei, Jie Liu. 248-262
- Maximizing I/O Bandwidth for Reverse Time Migration on Heterogeneous Large-Scale Systems. Tariq Alturkestani, Hatem Ltaief, David E. Keyes. 263-278
- TorqueDB: Distributed Querying of Time-Series Data from Edge-local Storage. Dhruv Garg, Prathik Shirolkar, Anshu Shukla, Yogesh Simmhan. 281-295
- Data-Centric Distributed Computing on Networks of Mobile Devices. Pedro Sanches, João A. Silva, António Teófilo, Hervé Paulino. 296-311
- WPSP: A Multi-correlated Weighted Policy for VM Selection and Migration for Cloud Computing. Sergi Vila, Josep L. Lérida, Fernando Cores, Fernando Guirado, Fábio L. Verdi. 312-326
- LCP-Aware Parallel String Sorting. Jonas Ellert, Johannes Fischer 0001, Nodari Sitchinava. 329-342
- Mobile RAM and Shape Formation by Programmable Particles. Giuseppe Antonio Di Luna, Paola Flocchini, Nicola Santoro, Giovanni Viglietta, Yukiko Yamauchi. 343-358
- Approximation Algorithm for Estimating Distances in Distributed Virtual Environments. Olivier Beaumont, Tobias Castanet, Nicolas Hanusse, Corentin Travers. 359-375
- On the Power of Randomization in Distributed Algorithms in Dynamic Networks with Adaptive Adversaries. Irvan Jahja, Haifeng Yu, Ruomu Hou. 376-391
- 3D Coded SUMMA: Communication-Efficient and Robust Parallel Matrix Multiplication. Haewon Jeong, Yaoqing Yang, Vipul Gupta, Christian Engelmann, Tze Meng Low, Viveck R. Cadambe, Kannan Ramchandran, Pulkit Grover. 392-407
- Managing Failures in Task-Based Parallel Workflows in Distributed Computing Environments. Jorge Ejarque, Marta Bertran, Javier Álvarez Cid-Fuentes, Javier Conejero, Rosa M. Badia. 411-425
- Accelerating Nested Data Parallelism: Preserving Regularity. Lars B. van den Haak, Trevor L. McDonell, Gabriele K. Keller, Ivo Gabe de Wolff. 426-442
- Using Dynamic Broadcasts to Improve Task-Based Runtime Performances. Alexandre Denis 0001, Emmanuel Jeannot, Philippe Swartvagher, Samuel Thibault. 443-457
- A Compression-Based Design for Higher Throughput in a Lock-Free Hash Map. Pedro Moreno, Miguel Areias, Ricardo Rocha 0001. 458-473
- NV-PhTM: An Efficient Phase-Based Transactional System for Non-volatile Memory. Alexandro Baldassin, Rafael Murari, João P. L. de Carvalho, Guido Araujo, Daniel Castro 0004, João Barreto 0001, Paolo Romano 0002. 477-492
- Enhancing Resource Management Through Prediction-Based Policies. Antoni Navarro, Arthur F. Lorenzon, Eduard Ayguadé, Vicenç Beltran 0001. 493-509
- Accelerating Overlapping Community Detection: Performance Tuning a Stochastic Gradient Markov Chain Monte Carlo Algorithm. Ismail El-Helw, Rutger F. H. Hofman, Henri E. Bal. 510-526
- A Prediction Framework for Fast Sparse Triangular Solves. Najeeb Ahmad, Buse Yilmaz, Didem Unat. 529-545
- Multiprecision Block-Jacobi for Iterative Triangular Solves. Fritz Göbel, Hartwig Anzt, Terry Cojean, Goran Flegar, Enrique S. Quintana-Ortí. 546-560
- Efficient Ephemeris Models for Spacecraft Trajectory Simulations on GPUs. Fabian Schrammel, Florian Renk, Arya Mazaheri, Felix Wolf 0001. 561-577
- Parallel Finite Cell Method with Adaptive Geometric Multigrid. Seyed Saberi, A. Vogel, Günther Meschke. 578-593
- cuDTW++: Ultra-Fast Dynamic Time Warping on CUDA-Enabled GPUs. Bertil Schmidt, Christian Hundt 0002. 597-612
- Heterogeneous CPU+iGPU Processing for Efficient Epistasis Detection. Rafael Campos, Diogo Marques, Sergio Santander-Jiménez, Leonel Sousa, Aleksandar Ilic. 613-628
- SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing. Sohan Lal, Aksel Alpay, Philip Salzmann, Biagio Cosenza, Alexander Hirsch, Nicolai Stawinoga, Peter Thoman, Thomas Fahringer, Vincent Heuveline. 629-644