Journal: TACO

Volume 18, Issue 4

Cunlu Li, Dezun Dong, Shazhou Yang, Xiangke Liao, Guangyu Sun, Yongheng Liu. CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers
Jerzy Proficz. All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Matthew Tomei, Shomit Das, Mohammad Seyedzadeh, Philip Bedoukian, Bradford M. Beckmann, Rakesh Kumar 0002, David A. Wood 0001. Byte-Select Compression
Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser. Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation
Joscha Benz, Oliver Bringmann 0001. Scenario-Aware Program Specialization for Timing Predictability
An Zou, Huifeng Zhu, Jingwen Leng, Xin He, Vijay Janapa Reddi, Christopher D. Gill, Xuan Zhang 0001. System-level Early-stage Modeling and Evaluation of IVR-assisted Processor Power Delivery System
Wonik Seo, Sanghoon Cha, Yeonjae Kim, Jaehyuk Huh, Jongse Park. SLO-Aware Inference Scheduler for Heterogeneous Processors in Edge Platforms
Kaustav Goswami 0002, Dip Sankar Banerjee, Shirshendu Das. Towards Enhanced System Efficiency while Mitigating Row Hammer
Rui Xu, Sheng Ma, Yaohua Wang, Xinhai Chen, Yang Guo 0003. Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks
Yasir Mahmood Qureshi, William Andrew Simon, Marina Zapater, Katzalin Olcoz, David Atienza. Gem5-X: A Many-core Heterogeneous Simulation Platform for Architectural Exploration and Optimization
Shounak Chakraborty 0001, Magnus Själander. WaFFLe: Gated Cache-Ways with Per-Core Fine-Grained DVFS for Reduced On-Chip Temperature and Leakage Consumption
Zhibing Sha, Jun Li 0062, Lihao Song, Jiewen Tang, Min Huang, Zhigang Cai, Lianju Qian, Jianwei Liao, Zhiming Liu 0001. Low I/O Intensity-aware Partial GC Scheduling to Reduce Long-tail Latency in SSDs
Tina Jung, Fabian Ritter 0002, Sebastian Hack. PICO: A Presburger In-bounds Check Optimization for Compiler-based Memory Safety Instrumentations
Aninda Manocha, Tyler Sorensen 0001, Esin Tureci, Opeoluwa Matthews, Juan L. Aragón, Margaret Martonosi. GraphAttack: Optimizing Data Supply for Graph Applications on In-Order Multicore Architectures
Syed Asad Alam, James Garland, David Gregg. Low-precision Logarithmic Number Systems: Beyond Base-2
Sriseshan Srikanth, Anirudh Jain, Thomas M. Conte, Erik P. DeBenedictis, Jeanine E. Cook. SortCache: Intelligent Cache Management for Accelerating Sparse Data Workloads
Paul Metzger, Volker Seeker, Christian Fensch, Murray Cole. Device Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems
Candace Walden, Devesh Singh, Meenatchi Jagasivamani, Shang Li 0001, Luyi Kang, Mehdi Asnaashari, Sylvain Dubois, Bruce L. Jacob, Donald Yeung. Monolithically Integrating Non-Volatile Main Memory over the Last-Level Cache
M. Hüsrev Cilasun, Salonik Resch, Zamshed I. Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas Peterson, Keshab K. Parhi, Jianping Wang 0006, Sachin S. Sapatnekar, Ulya R. Karpuzcu. Spiking Neural Networks in Spintronic Computational RAM
Yu Zhang 0027, Da Peng, Xiaofei Liao, Hai Jin 0001, Haikun Liu, Lin Gu 0002, Bingsheng He. LargeGraph: An Efficient Dependency-Aware GPU-Accelerated Large-Scale Graph Processing

Volume 18, Issue 3

Wim Heirman, Stijn Eyerman, Kristof Du Bois, Ibrahim Hur. Automatic Sublining for Efficient Sparse Memory Accesses
Shoaib Akram 0001. Performance Evaluation of Intel Optane Memory for Managed Workloads
Hamza Omar, Omer Khan. PRISM: Strong Hardware Isolation-based Soft-Error Resilient Multicore Architecture with High Performance and Availability at Low Hardware Overheads
George Charitopoulos, Dionisios N. Pnevmatikatos, Georgi Gaydadjiev. MC-DeF: Creating Customized CGRAs for Dataflow Applications
Mustafa Cavus, Mohammed Shatnawi, Resit Sendag, Augustus K. Uht. Fast Key-Value Lookups with Node Tracker
Daniel Rodrigues Carvalho, André Seznec. Understanding Cache Compression
Michael Stokes, David B. Whalley, Soner Önder. Decreasing the Miss Rate and Eliminating the Performance Penalty of a Data Filter Cache
Sugandha Tiwari, Neel Gala, Chester Rebeiro, V. Kamakoti 0001. PERI: A Configurable Posit Enabled RISC-V Core
Daniel Thuerck, Nicolas Weber, Roberto Bifulco. Flynn's Reconciliation: Automating the Register Cache Idiom for Cross-accelerator Programming
Weijia Song, Christina Delimitrou, Zhiming Shen, Robbert van Renesse, Hakim Weatherspoon, Lotfi Benmohamed, Frederic J. de Vaulx, Charif Mahmoudi. CacheInspector: Reverse Engineering Cache Resources in Public Clouds
Devashree Tripathy, AmirAli Abdolrashidi, Laxmi Narayan Bhuyan, Liang Zhou, Daniel Wong 0001. PAVER: Locality Graph-Based Thread Block Scheduling for GPUs
Ricardo Alves 0001, Stefanos Kaxiras, David Black-Schaffer. Early Address Prediction: Efficient Pipeline Prefetch and Reuse
João P. L. de Carvalho, Braedy Kuzma, Ivan Korostelev, José Nelson Amaral, Christopher Barton, José Moreira, Guido Araujo. KernelFaRer: Replacing Native-Code Idioms with High-Performance Library Calls
Ya-shuai Lü, Hui Guo 0004, Libo Huang, Qi Yu 0003, Li Shen 0007, Nong Xiao, Zhiying Wang 0003. GraphPEG: Accelerating Graph Processing on GPUs
Jose M. Rodriguez Borbon, Junjie Huang, Bryan M. Wong, Walid A. Najjar. Acceleration of Parallel-Blocked QR Decomposition of Tall-and-Skinny Matrices on FPGAs

Volume 18, Issue 2

Maxime France-Pillois, Jérôme Martin, Frédéric Rousseau. A Non-Intrusive Tool Chain to Optimize MPSoC End-to-End Systems
Pengyu Wang 0003, Jing Wang, Chao Li 0009, Jianzong Wang, Haojin Zhu, Minyi Guo. Grus: Toward Unified-memory-efficient High-performance Graph Processing on GPU
Arnab Kumar Biswas. Cryptographic Software IP Protection without Compromising Performance or Timing Side-channel Leakage
Nhut-Minh Ho, Himeshi De Silva, Weng-Fai Wong. GRAM: A Framework for Dynamically Mixing Precisions in GPU Applications
Muhammad Hassan, Chang Hyun Park 0001, David Black-Schaffer. A Reusable Characterization of the Memory System Behavior of SPEC2017 and SPEC2006
Ramin Izadpanah, Christina L. Peterson, Yan Solihin, Damian Dechev. PETRA: Persistent Transactional Non-blocking Linked Data Structures
Anirudh Mohan Kaushik, Gennady Pekhimenko, Hiren D. Patel. Gretch: A Hardware Prefetcher for Graph Analytics
Nils Voss, Bastiaan Kwaadgras, Oskar Mencer, Wayne Luk, Georgi Gaydadjiev. On Predictable Reconfigurable System Design

Volume 18, Issue 1

Kleovoulos Kalaitzidis, André Seznec. Leveraging Value Equality Prediction for Value Speculation
Paolo Sylos Labini, Marco Cianfriglia, Damiano Perri, Osvaldo Gervasi, Grigori Fursin, Anton Lokhmotov, Cedric Nugteren, Bruno Carpentieri, Fabiana Zollo, Flavio Vella. On the Anatomy of Predictive Models for Accelerating GPU Convolution Kernels and Beyond
Marcel Mettler, Daniel Mueller-Gritschneder, Ulf Schlichtmann. A Distributed Hardware Monitoring System for Runtime Verification on Multi-Tile MPSoCs
Abhishek Singh, Shail Dave, Pantea Zardoshti, Robert Brotzman, Chao Zhang 0039, Xiaochen Guo, Aviral Shrivastava, Gang Tan, Michael F. Spear. SPX64: A Scratchpad Memory for General-purpose Microprocessors
Minsu Kim, Jeong-Keun Park, Soo-Mook Moon. Irregular Register Allocation for Translation of Test-pattern Programs
Syed Mohammad Asad Hassan Jafri, Hasan Hassan, Ahmed Hemani, Onur Mutlu. Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators
Sooraj Puthoor, Mikko H. Lipasti. Systems-on-Chip with Strong Ordering
Wenjie Liu, Shoaib Akram, Jennifer B. Sartor, Lieven Eeckhout. Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories
Solomon Abera, M. Balakrishnan, Anshul Kumar. Performance-Energy Trade-off in Modern CMPs
Ari Rasch, Richard Schulze, Michel Steuwer, Sergei Gorlatch. Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)
Lorenz Braun, Sotirios Nikas, Chen Song, Vincent Heuveline, Holger Fröning. A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels
Atefeh Mehrabi, Aninda Manocha, Benjamin C. Lee, Daniel J. Sorin. Bayesian Optimization for Efficient Accelerator Synthesis
Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang 0020, Kim M. Hazelwood, David Brooks 0001. Exploiting Parallelism Opportunities with Deep Learning Frameworks
Sujay Yadalam, Vinod Ganapathy, Arkaprava Basu. XL: Security and Performance for Enclaves Using Large Pages
Negin Nematollahi, Mohammad Sadrosadati, Hajar Falahati, Marzieh Barkhordar, Mario Paulo Drumond, Hamid Sarbazi-Azad, Babak Falsafi. Efficient Nearest-Neighbor Data Sharing in GPUs
Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Bharat Kaul, Gagandeep Goyal, Ramakrishna Upadrasta. PolyDL: Polyhedral Optimizations for Creation of High-performance DL Primitives