- A Closer Look at Lightweight Graph Reordering. Priyank Faldu, Jeff Diamond, Boris Grot. 1-13
- Detecting Last-Level Cache Contention in Workload Colocation with Meta Learning. Huanxing Shen, Cong Li. 14-23
- Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs. Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, José Cano, Elliot J. Crowley, Björn Franke, Amos J. Storkey, Michael O'Boyle. 24-34
- Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices. Ramyad Hadidi, Jiashen Cao, Yilun Xie, Bahar Asgari, Tushar Krishna, Hyesoon Kim. 35-48
- Trimming the Tail for Deterministic Read Performance in SSDs. Nima Elyasi, Changho Choi, Anand Sivasubramaniam, Jingpei Yang, Vijay Balakrishnan. 49-58
- Optimizing Hyperplane Sweep Operations Using Asynchronous Multi-grain GPU Tasks. Anirudh Mohan Kaushik, Ashwin M. Aji, Muhammad Amber Hassaan, Noel Chalmers, Noah Wolfe, Scott Moe, Sooraj Puthoor, Bradford M. Beckmann. 59-69
- Efficacy of Statistical Sampling on Contemporary Workloads: The Case of SPEC CPU2017. Sarabjeet Singh, Manu Awasthi. 70-80
- Autonomous Data-Race-Free GPU Testing. Tuan Ta, XianWei Zhang, Anthony Gutierrez, Bradford M. Beckmann. 81-92
- SNU-NPB 2019: Parallelizing and Optimizing NPB in OpenCL and CUDA for Modern GPUs. Youngdong Do, Hyungmo Kim, Pyeongseok Oh, Daeyoung Park, Jaejin Lee. 93-105
- Workload-Aware DRAM Error Prediction using Machine Learning. Lev Mukhanov, Konstantinos Tovletoglou, Hans Vandierendonck, Dimitrios S. Nikolopoulos, Georgios Karakonstantis. 106-118
- Multi-Bit Upsets Vulnerability Analysis of Modern Microprocessors. Athanasios Chatzidimitriou, George Papadimitriou, Christos Gavanas, George Katsoridas, Dimitris Gizopoulos. 119-130
- Deep Learning Language Modeling Workloads: Where Time Goes on Graphics Processors. Ali Hadi Zadeh, Zissis Poulos, Andreas Moshovos. 131-142
- Evaluation of Non-Volatile Memory Based Last Level Cache Given Modern Use Case Behavior. Alexander Hankin, Tomer Shapira, Karthik Sangaiah, Michael Lui, Mark Hempstead. 143-154
- One Size Doesn't Fit All: Quantifying Performance Portability of Graph Applications on GPUs. Tyler Sorensen, Sreepathi Pai, Alastair F. Donaldson. 155-166
- BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models. Yishen Chen, Ajay Brahmakshatriya, Charith Mendis, Alex Renda, Eric Atkinson, Ondrej Sýkora, Saman P. Amarasinghe, Michael Carbin. 167-177
- SimdHT-Bench: Characterizing SIMD-Aware Hash Table Designs on Emerging CPU Architectures. Dipti Shankar, Xiaoyi Lu, Dhabaleswar K. (DK) Panda. 178-188
- Characterizing Deep Learning Training Workloads on Alibaba-PAI. Mengdi Wang, Chen Meng, Guoping Long, Chuan Wu, Jun Yang, Wei Lin, Yangqing Jia. 189-202
- An Overflow-free Quantized Memory Hierarchy in General-purpose Processors. Marzieh Lenjani, Patricia Gonzalez-Guerrero, Elaheh Sadredini, M. Arif Rahman, Mircea R. Stan. 203-215
- Faster than Flash: An In-Depth Study of System Challenges for Emerging Ultra-Low Latency SSDs. Sungjoon Koh, Junhyeok Jang, Changrim Lee, Miryeong Kwon, Jie Zhang, Myoungsoo Jung. 216-227
- Branch Prediction Is Not A Solved Problem: Measurements, Opportunities, and Future Directions. Chit-Kwan Lin, Stephen J. Tarsa. 228-238
- HolDCSim: A Holistic Simulator for Data Centers. Fan Yao, Kathy Ngyugen, Sai Santosh Dayapule, Jingxin Wu, Bingqian Lu, Suresh Subramaniam, Guru Venkataramani. 239-242
- Optimizing GPU Cache Policies for MI Workloads. Johnathan Alsop, XianWei Zhang, Tsung Tai Yeh, Bradford M. Beckmann, Matthew D. Sinclair, Srikant Bharadwaj, Alexandru Dutu, Anthony Gutierrez, Onur Kayiran, Michael LeBeane, Brandon Potter, Sooraj Puthoor. 243-248
- Persistent Memory Workload Characterization: A Hardware Perspective. Xiao Liu, Bhaskar Jupudi, Pankaj Mehra, Jishen Zhao. 249-252
- Fingerprinting Anomalous Computation with RNN for GPU-accelerated HPC Machines. Pengfei Zou, Ang Li, Kevin Barker, Rong Ge. 253-256
- Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators. Swagath Venkataramani, Jungwook Choi, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, Leland Chang. 257-262
- Barrier Synchronization vs. Voltage Noise: A Quantitative Analysis. Zamshed I. Chowdhury, S. Karen Khatamifard, Zhaoyong Zheng, Tali Moreshet, R. Iris Bahar, Ulya R. Karpuzcu. 263-267
- Characterizing the Performance/Accuracy Tradeoff of High-Precision Applications via Auto-tuning. Ruidong Gu, Paul Beata, Michela Becchi. 268-272