- DACHash: A Dynamic, Cache-Aware and Concurrent Hash Table on GPUs. Hao Zhou, David Troendle, Byunghyun Jang. 1-10
- A Low-Power Hardware Accelerator for ORB Feature Extraction in Self-Driving Cars. Raúl Taranco, José María Arnau, Antonio González. 11-21
- Opening the Black Box: Performance Estimation during Code Generation for GPUs. Dominik Ernst, Georg Hager, Matthias Knorr, Gerhard Wellein, Markus Holzer 0005. 22-32
- SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference. Jude Haris, Perry Gibson, José Cano 0001, Nicolas Bohm Agostini, David R. Kaeli. 33-43
- Improving Phased Transactional Memory via Commit Throughput and Capacity Estimation. Catalina Munoz Morales, Bruno C. Honorio, Alexandro Baldassin, Guido Araujo. 44-53
- Synchronization Strategies on Many-Core SMT Systems. Agustín Navarro-Torres, Jesús Alastruey-Benedé, Pablo Ibáñez-Marín, Maria Carpen Amarie. 54-63
- Design and evaluation of associative processing kernels. Jonathas Silveira, Lucas Wanner. 64-73
- A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory. João Vicente Souto, Márcio Castro 0001, Pedro Henrique Penna. 74-83
- Efficient Tensor Slicing for Multicore NPUs using Memory Burst Modeling. Rafael C. F. Sousa, Byungmin Jung, Jaehwa Kwak, Michael Frank 0008, Guido Araujo. 84-93
- Sparsity-aware Power Gating for Tensor Cores. Ehsan Atoofian. 94-103
- Employing Simulation to Facilitate the Design of Dynamic Binary Translators. Vanderson Martins do Rosario, Raphael Zinsly, Sandro Rigo, Edson Borin. 104-113
- Register Flush-free Runahead Execution for Modern Vector Processors. Hikaru Takayashiki, Masayuki Sato 0001, Kazuhiko Komatsu, Hiroaki Kobayashi. 114-125
- Shelf schedules for independent moldable tasks to minimize the energy consumption. Anne Benoit, Louis-Claude Canon, Redouane Elghazi, Pierre-Cyrille Héam. 126-136
- Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum. Zeina Houmani, Daniel Balouek-Thomert, Eddy Caron, Manish Parashar. 137-146
- FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments. Michael Guilherme Jordan, Guilherme Korol, Mateus Beck Rutzig, Antonio Carlos Schneider Beck. 147-156
- HPC Data Storage at a Glance: The Santos Dumont Experience. André Ramos Carneiro, Jean Luca Bez, Carla Osthoff, Lucas Mello Schnorr, Philippe O. A. Navaux. 157-166
- Sparbit: a new logarithmic-cost and data locality-aware MPI Allgather algorithm. Wilton Jaciel Loch, Guilherme Piêgas Koslovski. 167-176
- Efficient Online 4D Magnetic Resonance Imaging. Marco Barbone, Andreas Wetscherek, Thomas Yung, Uwe Oelfke, Wayne Luk, Georgi Gaydadjiev. 177-187
- Functional Approximation and Approximate Parallelization with the ACCEPT compiler. Lucas Reis, Lucas Wanner. 188-197
- Sampling-based Sparse Format Selection on GPUs. Gangyi Zhu, Gagan Agrawal. 198-208
- FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs. Erfan Bank-Tavakoli, Michael Riera, Masudul Hassan Quraishi, Fengbo Ren. 209-220