- Characterization of Data Compression in Datacenters. Geonhwa Jeong, Bikash Sharma, Nick Terrell, Abhishek Dhanotia, Zhiwei Zhao, Niket Agarwal, Arun Kejariwal, Tushar Krishna. 1-12
- PES: An Energy and Throughput Model for Energy Harvesting IoT Systems. Fatemeh Ghasemi, Lukas Liedtke, Magnus Jahre. 13-23
- PyTFHE: An End-to-End Compilation and Execution Framework for Fully Homomorphic Encryption Applications. Jiaao Ma, Ceyu Xu, Lisa Wu Wills. 24-34
- Evaluating Machine Learning Workloads on Memory-Centric Computing Systems. Juan Gómez-Luna, Yuxin Guo, Sylvan Brocard, Julien Legriel, Remy Cimadomo, Geraldo F. Oliveira, Gagandeep Singh, Onur Mutlu. 35-49
- MQL: ML-Assisted Queuing Latency Analysis for Data Center Networks. Shruti Yadav Narayana, Jie Tong, Anish Krishnakumar, Nuriye Yildirim, Emily Shriver, Mahesh Ketkar, Ümit Y. Ogras. 50-60
- A Characterization of the Effects of Software Instruction Prefetching on an Aggressive Front-end. Gino Chacon, Nathan Gober, Krishnendra Nathella, Paul V. Gratz, Daniel A. Jiménez. 61-70
- MBPlib: Modular Branch Prediction Library. Emilio Domínguez-Sánchez, Alberto Ros. 71-80
- Evaluating the Impact of Optimizations for Dynamic Binary Modification on 64-bit RISC-V. John Alistair Kressel, Guillermo Callaghan, Cosmin Gorgovan, Mikel Luján. 81-91
- An Application-Oriented Approach to Designing Hybrid CPU Architectures. Anna Yue, Sanyam Mehta. 92-102
- Profiling gem5 Simulator. Johnson Umeike, Neel Patel, Alex Manley, Amin Mamandipoor, Heechul Yun, Mohammad Alian. 103-113
- A Novel Simulation Methodology for Silicon Photonic Switching Fabrics. Markos Kynigos, Javier Navaridas, Jose Antonio Pascual, Mikel Luján. 114-123
- Simulating Wrong-Path Instructions in Decoupled Functional-First Simulation. Stijn Eyerman, Sam Van den Steen, Wim Heirman, Ibrahim Hur. 124-133
- Is the Future Cold or Tall? Design Space Exploration of Cryogenic and 3D Embedded Cache Memory. Alexander Hankin, Lillian Pentecost, Dongmoon Min, David Brooks, Gu-Yeon Wei. 134-144
- MergePath-SpMM: Parallel Sparse Matrix-Matrix Algorithm for Graph Neural Network Acceleration. Mohsin Shan, Deniz Gurevin, Jared Nye, Caiwen Ding, Omer Khan. 145-156
- CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (TinyML) Acceleration on FPGAs. Shvetank Prakash, Tim Callahan, Joseph Bushagour, Colby R. Banbury, Alan V. Green, Pete Warden, Tim Ansell, Vijay Janapa Reddi. 157-167
- ® PIUMA. Matthew Joseph Adiletta, Jesmin Jahan Tithi, Emmanouil-Ioannis Farsarakis, Gerasimos Gerogiannis, Robert Adolf, Robert Benke, Sidharth Kashyap, Samuel Hsia, Kartik Lakhotia, Fabrizio Petrini, Gu-Yeon Wei, David Brooks. 168-177
- Genomics-GPU: A Benchmark Suite for GPU-accelerated Genome Analysis. Zhuren Liu, Shouzhe Zhang, Justin Garrigus, Hui Zhao. 178-188
- Exploring the Efficiency of Data-Oblivious Programs. Lauren Biernacki, Biniyam Mengist Tiruye, Meron Zerihun Demissie, Fitsum Assamnew Andargie, Brandon Reagen, Todd M. Austin. 189-200
- Redwood: Flexible and Portable Heterogeneous Tree Traversal Workloads. Yanwen Xu, Ang Li, Tyler Sorensen. 201-213
- Community-based Matrix Reordering for Sparse Linear Algebra Optimization. Vignesh Balaji, Neal Clayton Crago, Aamer Jaleel, Stephen W. Keckler. 214-223
- Sieve: Stratified GPU-Compute Workload Sampling. Mahmood Naderan-Tahan, Hossein SeyyedAghaei, Lieven Eeckhout. 224-234
- TransPimLib: Efficient Transcendental Functions for Processing-in-Memory Systems. Maurus Item, Geraldo F. Oliveira, Juan Gómez-Luna, Mohammad Sadrosadati, Yuxin Guo, Onur Mutlu. 235-247
- Early-Adaptor: An Adaptive Framework for Proactive UVM Memory Management. Seokjin Go, Hyunwuk Lee, Junsung Kim, Jiwon Lee, Myung Kuk Yoon, Won Woo Ro. 248-258
- Sunstone: A Scalable and Versatile Scheduler for Mapping Tensor Algebra on Spatial Accelerators. MohammadHossein Olyaiy, Christopher Ng, Alexandra (Sasha) Fedorova, Mieszko Lis. 259-271
- RPU: The Ring Processing Unit. Deepraj Soni, Negar Neda, Naifeng Zhang, Benedict Reynwar, Homer Gamil, Benjamin Heyman, Mohammed Nabeel, Ahmad Al Badawi, Yuriy Polyakov, Kellie Canida, Massoud Pedram, Michail Maniatakos, David Bruce Cousins, Franz Franchetti, Matthew French, Andrew G. Schmidt, Brandon Reagen. 272-282
- ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale. William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, Sudarshan Srinivasan, Tushar Krishna. 283-294
- Boreas: A Cost-Effective Mitigation Method for Advanced Hotspots using Machine Learning and Hardware Telemetry. Maziar Amiraski, David Werner, Alexander Hankin, Julien Sebot, Kaushik Vaidyanathan, Mark Hempstead. 295-305
- AMPeD: An Analytical Model for Performance in Distributed Training of Transformers. Diksha Moolchandani, Joyjit Kundu, Frederik Ruelens, Peter Vrancx, Timon Evenblij, Manu Perumkunnil. 306-315
- LoopTree: Enabling Exploration of Fused-layer Dataflow Accelerators. Michael Gilbert, Yannan Nellie Wu, Angshuman Parashar, Vivienne Sze, Joel S. Emer. 316-318
- Degree-Aware Kernel Mapping for Graph Processing on GPUs. Sanya Srivastava, Tyler Sorensen. 319-321
- lfbench: a lock-free microbenchmark suite. Mahita Nagabhiru, Greg Byrd. 322-324
- A Benchmark Suite for Improving Performance Portability of the SYCL Programming Model. Zheming Jin, Jeffrey S. Vetter. 325-327
- Impact of Optimal Design Point on Performance Metrics of DNN accelerators in FPGA. Tom Glint, Aryan Gupta, Daniel Giftson, Gaurav Shah, Vrajesh Patel, Ruchit Chudasama, Sukanya More, Joycee Mekie. 328-330
- Workload Characterization Using Hierarchical PCA. Lina Sawalha, Grant Deljevic. 331-333
- Analyzing Energy Efficiency of a Server with a SmartNIC under SLO Constraints. Jinghan Huang, Jiaqi Lou, Yan Sun, Tianchen Wang, Eun-Kyung Lee, Nam Sung Kim. 334-336
- KORDI: A Framework for Real-Time Performance and Cost Optimization of Apache Spark Streaming. Athanasios Kordelas, Thanasis Spyrou, Spyros Voulgaris, Vasileios Megalooikonomou, Nikos Deligiannis. 337-339
- Enabling Design Space Exploration of DRAM Caches for Emerging Memory Systems. Maryam Babaie, Ayaz Akram, Jason Lowe-Power. 340-342
- A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs. Ying Li, Yifan Sun, Adwait Jog. 343-345
- A survey and comparison of consistent hashing algorithms. Massimo Coluzzi, Amos Brocco, Patrizio Contu, Tiziano Leidi. 346-348
- Analysis of Conventional, Near-Memory, and In-Memory DNN Accelerators. Tom Glint, Chandan Kumar Jha, Manu Awasthi, Joycee Mekie. 349-351
- RAINBOW: Multi-Dimensional Hardware-Software Co-Design for DL Accelerator On-Chip Memory. Stavroula Zouzoula, Muhammad Waqar Azhar, Pedro Trancoso. 352-354
- Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators. Arne Symons, Linyan Mei, Steven Colleman, Pouya Houshmand, Sebastian Karl, Marian Verhelst. 355-357