- Profiling Intel Graphics Architecture with Long Instruction Traces. Konstantin Levit-Gurevich, Alex Skaletsky, Michael Berezalsky, Yulia Kuznetcova, Hila Yakov. 1-11
- Performance Analysis and Optimization with Little's Law. Sanyam Mehta. 12-23
- The Indigo Program-Verification Microbenchmark Suite of Irregular Parallel Code Patterns. Yiqian Liu, Noushin Azami, Corbin Walters, Martin Burtscher. 24-34
- gpuFI-4: A Microarchitecture-Level Framework for Assessing the Cross-Layer Resilience of Nvidia GPUs. Dimitris Sartzetakis, George Papadimitriou 0001, Dimitris Gizopoulos. 35-45
- Distilling the Real Cost of Production Garbage Collectors. Zixian Cai, Stephen M. Blackburn, Michael D. Bond, Martin Maas 0001. 46-57
- Scale-Model Architectural Simulation. Wenjie Liu, Wim Heirman, Stijn Eyerman, Shoaib Akram 0001, Lieven Eeckhout. 58-68
- MEGsim: A Novel Methodology for Efficient Simulation of Graphics Workloads in GPUs. Jorge Ortiz, David Corbalán-Navarro, Juan L. Aragón, Antonio González 0001. 69-78
- MARTA: Multi-configuration Assembly pRofiler and Toolkit for performance Analysis. Marcos Horro, Louis-Noël Pouchet, Gabriel Rodríguez 0001, Juan Touriño. 79-89
- Left-shifter: A pre-silicon framework for usage model based performance verification of the PCIe interface in server processor system on chips. Tessil Thomas, Bharath Venkatasubramanian, Dinesh Sthapit, Christopher Gray, Atresh Gummadavelly, Janick Bergeron, Pankaj Mehta, Prabu Thangamuthu. 90-98
- FOURST: A code generator for FFT-based fast stencil computations. Zafar Ahmad, Mohammad Mahdi Javanmard, Gregory Croisdale, Aaron Gregory, Pramod Ganapathi, Louis-Noël Pouchet, Rezaul Chowdhury. 99-108
- Flexible Binary Instrumentation Framework to Profile Code Running on Intel GPUs. Alex Skaletsky, Konstantin Levit-Gurevich, Michael Berezalsky, Yulia Kuznetcova, Hila Yakov. 109-120
- POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning. Shalini Jain 0002, Yashas Andaluri, S. VenkataKeerthy, Ramakrishna Upadrasta. 121-131
- XFeatur: Hardware Feature Extraction for DNN Auto-tuning. Jorge Sierra Acosta, Andreas Diavastos, Antonio Gonzalez. 132-134
- SGXGauge: A Comprehensive Benchmark Suite for Intel SGX. Sandeep Kumar, Abhisek Panda, Smruti R. Sarangi. 135-137
- SAPCo Sort: optimizing Degree-Ordering for Power-Law Graphs. Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck. 138-140
- TILE-SIM: A Systematic Approach to Systolic Array-based Accelerator Evaluation. Yuhang Li, Mei Wen, Jiawei Fei, Junzhong Shen, Yasong Cao. 141-143
- FASE: A Fast, Accurate and Seamless Emulator for Custom Numerical Formats. John Osorio, Adrià Armejach, Eric Petit, Greg Henry, Marc Casas. 144-146
- Meterstick: Benchmarking Performance Variability in Cloud and Self-hosted Minecraft-like Games. Jerrit Eickhoff, Jesse Donkervliet, Alexandru Iosup. 147-149
- Simulating Noisy Channels in DNA Storage. Mayank Keoliya, Puru Sharma, Djordje Jevdjic. 150-152
- OS-level Implications of Using DRAM Caches in Memory Disaggregation. Bin Gao, Hao-Wei Tee, Alireza Sanaee, Soh Boon Jun, Djordje Jevdjic. 153-155
- VIPP: Validation-Included Precision-Parametric N-Body Benchmark Suite. Shigeyuki Sato, Kota Iizuka, Naoki Yoshifuji, Masaki Natsume. 156-158
- High-Performance Deployment of Text Detection Model: Compression and Hardware Platform considerations. Nupur Sumeet, Karan Rawat, Manoj Nambiar. 159-161
- Roofline Model for UAVs: A Bottleneck Analysis Tool for Onboard Compute Characterization of Autonomous Unmanned Aerial Vehicles. Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Ninad Jadhav, Aleksandra Faust, Vijay Janapa Reddi. 162-174
- RTRBench: A Benchmark Suite for Real-Time Robotics. Mohammad Bakhshalipour, Maxim Likhachev, Phillip B. Gibbons. 175-186
- Characterization of MPC-based Private Inference for Transformer-based Models. Yongqin Wang, G. Edward Suh, Wenjie Xiong, Benjamin Lefaudeux, Brian Knott, Murali Annavaram, Hsien-Hsin S. Lee. 187-197
- Spatiotemporal Strategies for Long-Term FPGA Resource Management. Atefeh Mehrabi, Daniel J. Sorin, Benjamin C. Lee. 198-209
- Eris: Fault Injection and Tracking Framework for Reliability Analysis of Open-Source Hardware. Shubham Nema, Justin Kirschner, Debpratim Adak, Sapan Agarwal, Ben Feinberg, Arun F. Rodrigues, Matthew J. Marinella, Amro Awad. 210-220
- Understanding Data Compression in Warehouse-Scale Datacenter Services. Geonhwa Jeong, Bikash Sharma, Nick Terrell, Abhishek Dhanotia, Zhiwei Zhao, Niket Agarwal, Arun Kejariwal, Tushar Krishna. 221-223
- LoopIn: A Loop-Based Simulation Sampling Mechanism. Uday Kumar Reddy Vengalam, Anshujit Sharma, Michael C. Huang 0001. 224-226
- Building a Performance Model for Deep Learning Recommendation Model Training on GPUs. Zhongyi Lin, Louis Feng, Ehsan K. Ardestani, Jaewon Lee, John Lundell, Changkyu Kim, Arun Kejariwal, John D. Owens. 227-229
- Advancing Near-Data Processing with Precise Exceptions and Efficient Data Fetching. Sairo R. dos Santos, Tiago Rodrigo Kepe, Francis B. Moreira, Paulo C. Santos 0001, Marco A. Z. Alves. 230-232
- Profiling an Architectural Simulator. Nedasadat Taheri, Alexander Manely, Ahmni R. Pang, Mohammad Alian. 233-235
- Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices. Kshitij Bhardwaj, James Diffenderfer, Bhavya Kailkhura, Maya B. Gokhale. 236-238
- GPUCalorie: Floorplan Estimation for GPU Thermal Evaluation. Marcus Chow, Ali Jahanshahi, Ana Cardenas Beltran, Sheldon X.-D. Tan, Daniel Wong 0001. 239-241
- ARBench: Augmented Reality Benchmark For Mobile Devices. Sofiane Chetoui, Rahul Shahi, Seif Abdelaziz, Abhinav Golas, Farrukh Hijaz, Sherief Reda. 242-244
- Cross-Level Characterization of Program Behavior: (Extended Poster Abstract). Li Tang, Scott Pakin. 245-247
- A SIMT Analyzer for Multi-Threaded CPU Applications. Ahmad Alawneh, Mahmoud Khairy, Timothy G. Rogers. 248-250
- Microarchitectural Performance Evaluation of AV1 Video Encoding Workloads. Steffen Jensen, Jaekyu Lee, Dam Sunwoo, Matthew J. Horsnell, Lizy K. John. 251-253
- Ruby: Improving Hardware Efficiency for Tensor Algebra Accelerators Through Imperfect Factorization. Mark Horeni, Pooria Taheri, Po-An Tsai, Angshuman Parashar, Joel S. Emer, Siddharth Joshi. 254-266
- Pareto Rank Surrogate Model for Hardware-aware Neural Architecture Search. Hadjer Benmeziane, Smaïl Niar, Hamza Ouarnoughi, Kaoutar El Maghraoui. 267-276
- Learning A Continuous and Reconstructible Latent Space for Hardware Accelerator Design. Qijing Huang 0001, Charles Hong, John Wawrzynek, Mahesh Subedar, Yakun Sophia Shao. 277-287
- Bifrost: End-to-End Evaluation and optimization of Reconfigurable DNN Accelerators. Axel Stjerngren, Perry Gibson, José Cano 0001. 288-299
- PCMCsim: An Accurate Phase-Change Memory Controller Simulator and its Performance Analysis. Hyokeun Lee, Hyungsuk Kim, Seokbo Shim, Seungyong Lee, Dosun Hong, Hyuk-Jae Lee, Hyun Kim. 300-310
- Address Translation Conscious Caching and Prefetching for High Performance Cache Hierarchy. Vasudha, Biswabandan Panda. 311-321
- DRAM Bandwidth and Latency Stacks: Visualizing DRAM Bottlenecks. Stijn Eyerman, Wim Heirman, Ibrahim Hur. 322-331