Journal: TACO

Volume 19, Issue 4

Ruobing Han, Jaewon Lee, Jaewoong Sim, Hyesoon Kim. COX : Exposing CUDA Warp-level Functions to CPUs
Yuhao Li, Benjamin C. Lee. Phronesis: Efficient Performance Modeling for High-dimensional Configuration Tuning
Yiding Liu, Xingyao Zhang, Donglin Zhuang, Xin Fu, Shuaiwen Song. DynamAP: Architectural Support for Dynamic Graph Traversal on the Automata Processor
Tim Hartley, Foivos S. Zakkak, Andy Nisbet, Christos Kotselidis, Mikel Luján. Just-In-Time Compilation on ARM - A Closer Look at Call-Site Code Consistency
Aart J. C. Bik, Penporn Koanantakool, Tatiana Shpeisman, Nicolas Vasilache, Bixia Zheng, Fredrik Kjolstad. Compiler Support for Sparse Tensor Computations in MLIR
Chao Zhang 0039, Maximilian H. Bremer, Cy P. Chan, John Shalf, Xiaochen Guo. ASA: Accelerating Sparse Accumulation in Column-wise SpGEMM
Changwei Zou, Yaoqing Gao, Jingling Xue. Practical Software-Based Shadow Stacks on x86-64
Amirreza Yousefzadeh, Jan Stuijt, Martijn Hijdra, Hsiao-Hsuan Liu, Anteneh Gebregiorgis, Abhairaj Singh, Said Hamdioui, Francky Catthoor. Energy-efficient In-Memory Address Calculation
Chandrahas Tirumalasetty, Chih-Chieh Chou, A. L. Narasimha Reddy, Paul Gratz, Ayman Abouelwafa. Reducing Minor Page Fault Overheads through Enhanced Page Walker
Pierre Michaud, Anis Peysieux. HAIR: Halving the Area of the Integer Register File with Odd/Even Banking
Lan Gao, Jing Wang, Weigong Zhang. Adaptive Contention Management for Fine-Grained Synchronization on Commodity GPUs
Erling Rennemo Jellum, Milica Orlandic, Edmund Brekke, Tor Arne Johansen, Torleiv H. Bryne. Solving Sparse Assignment Problems on FPGAs
Jiansong Li, Xueying Wang, Xiaobing Chen, Guangli Li, Xiao Dong, Peng Zhao 0008, Xianzhi Yu, Yongxin Yang, Wei Cao, Lei Liu 0030, Xiaobing Feng 0002. An Application-oblivious Memory Scheduling System for DNN Accelerators
Aditya Narayan, Yvain Thonnart, Pascal Vivet, Ayse K. Coskun, Ajay Joshi. Architecting Optically Controlled Phase Change Memory
Hwisoo So, Moslem Didehban, Yohan Ko, Aviral Shrivastava, Kyoungwoo Lee. EXPERTISE: An Effective Software-level Redundant Multithreading Scheme against Hardware Faults

Volume 19, Issue 3

Ziaul Choudhury, Shashwat Shrivastava, Lavanya Ramapantulu, Suresh Purini. An FPGA Overlay for CNN Inference with Fine-grained Flexible Parallelism
Horng-Ruey Huang, Ding-Yong Hong, Jan-Jan Wu, Kung-Fu Chen, Pangfeng Liu, Wei-Chung Hsu. Accelerating Video Captioning on Heterogeneous System Architectures
Mohammadreza Soltaniyeh, Richard P. Martin, Santosh Nagarakatte. An Accelerator for Sparse Convolutional Neural Networks Leveraging Systolic General Matrix-matrix Multiplication
Marcel Mettler, Martin Rapp, Heba Khdr, Daniel Mueller-Gritschneder, Jörg Henkel, Ulf Schlichtmann. An FPGA-based Approach to Evaluate Thermal and Resource Management Strategies of Many-core Processors
Dharanidhar Dang, Bill Lin 0001, Debashis Sahoo. LiteCON: An All-photonic Neuromorphic Accelerator for Energy-efficient Deep Learning
Matthew Benjamin Olson, Brandon Kammerdiener, Michael R. Jantz, Kshitij A. Doshi, Terry Jones. Online Application Guidance for Heterogeneous Memory Systems
Bruno Chinelato Honorio, João P. L. de Carvalho, Catalina Munoz Morales, Alexandro Baldassin, Guido Araujo. Using Barrier Elision to Improve Transactional Code Generation
Diksha Moolchandani, Anshul Kumar, Smruti R. Sarangi. Performance and Power Prediction for Concurrent Execution on GPUs
Cunlu Li, Dezun Dong, Xiangke Liao. MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers
Ping Wang, Fei Wen, Paul V. Gratz, Alex Sprintson. SIMD-Matcher: A SIMD-based Arbitrary Matching Framework
David Corbalán-Navarro, Juan L. Aragón, Martí Anglada, Joan-Manuel Parcerisa, Antonio González 0001. Triangle Dropping: An Occluded-geometry Predictor for Energy-efficient Mobile GPUs
Lokesh Siddhu, Rajesh Kedia, Shailja Pandey, Martin Rapp, Anuj Pathania, Jörg Henkel, Preeti Ranjan Panda. CoMeT: An Integrated Interval Thermal Simulation Toolchain for 2D, 2.5D, and 3D Processor-Memory Systems
Paschalis Mpeis, Pavlos Petoumenos, Kim M. Hazelwood, Hugh Leather. Object Intersection Captures on Interactive Apps to Drive a Crowd-sourced Replay-based Compiler Optimization
Peng Xu, Nannan Zhao, Jiguang Wan, Wei Liu, Shuning Chen, Yuanhui Zhou, Hadeel Albahar, Hanyang Liu, Liu Tang, Zhi-hu Tan. Building a Fast and Efficient LSM-tree Store by Integrating Local Storage with Cloud Storage
Johnathan Alsop, Weon Taek Na, Matthew D. Sinclair, Samuel Grayson, Sarita V. Adve. A Case for Fine-grain Coherence Specialization in Heterogeneous Systems
Ali Jahanshahi, Nanpeng Yu, Daniel Wong 0001. PowerMorph: QoS-Aware Server Power Reshaping for Data Center Regulation Service
Shivam Kundan, Theodoros Marinakis, Iraklis Anagnostopoulos, Dimitri Kagaris. A Pressure-Aware Policy for Contention Minimization on Multicore Systems

Volume 19, Issue 2

Ghassan Shobaki, Vahl Scott Gordon, Paul McHugh, Theodore Dubois, Austin Kerbow. Register-Pressure-Aware Instruction Scheduling Using Ant Colony Optimization
Rakesh Kumar 0003, Mehdi Alipour, David Black-Schaffer. Dependence-aware Slice Execution to Boost MLP in Slice-out-of-order Cores
Qihan Wang, Zhen Peng, Bin Ren, Jie Chen 0010, Robert G. Edwards. MemHC: An Optimized GPU Memory Management Framework for Accelerating Many-body Correlation
Nandita Vijaykumar, Ataberk Olgun, Konstantinos Kanellopoulos, F. Nisa Bostanci, Hasan Hassan, Mehrshad Lotfi, Phillip B. Gibbons, Onur Mutlu. MetaSys: A Practical Open-source Metadata Management System to Implement and Evaluate Cross-layer Optimizations
Kartik Lakshminarasimhan, Ajeya Naithani, Josué Feliu, Lieven Eeckhout. The Forward Slice Core: A High-Performance, Yet Low-Complexity Microarchitecture
Sharanyan Srikanthan, Sayak Chakraborti, Princeton Ferro, Sandhya Dwarkadas. MAPPER: Managing Application Performance via Parallel Efficiency Regulation*
Hugo Pompougnac, Ulysse Beaugnon, Albert Cohen 0001, Dumitru Potop-Butucaru. Weaving Synchronous Reactions into the Fabric of SSA-form Compilers
Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström. Cooperative Slack Management: Saving Energy of Multicore Processors by Trading Performance Slack Between QoS-Constrained Applications
Christof Schlaak, Tzung-Han Juang, Christophe Dubach. Memory-Aware Functional IR for Higher-Level Synthesis of Accelerators
George Michelogiannakis, Benjamin Klenk, Brandon Cook 0001, Min Yee Teh, Madeleine Glick, Larry Dennison, Keren Bergman, John Shalf. A Case For Intra-rack Resource Disaggregation in HPC
Athanasios Tziouvaras, Georgios Dimitriou, Georgios I. Stamoulis. Low-power Near-data Instruction Execution Leveraging Opcode-based Timing Analysis
Xingguo Jia, Jin Zhang, Boshi Yu, Xingyue Qian, Zhengwei Qi, Haibing Guan. GiantVM: A Novel Distributed Hypervisor for Resource Aggregation with DSM-aware Optimizations
Jing Chen, Madhavan Manivannan, Mustafa Abduljabbar, Miquel Pericàs. ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes
Chencheng Ye, Yuanchao Xu 0001, Xipeng Shen, Hai Jin 0001, Xiaofei Liao, Yan Solihin. Preserving Addressability Upon GC-Triggered Data Movements on Non-Volatile Memory

Volume 19, Issue 1

Cesar Gomes, Maziar Amiraski, Mark Hempstead. CASHT: Contention Analysis in Shared Hierarchies with Thefts
Yufei Wang, Xiaoshe Dong, Longxiang Wang, Weiduo Chen, Xingjun Zhang. Optimizing Small-Sample Disk Fault Detection Based on LSTM-GAN Model
Hongzhi Liu, Jie Luo, Ying Li, Zhonghai Wu. Iterative Compilation Optimization Based on Metric Learning and Collaborative Filtering
Mengya Lei, Fan Li, Fang Wang 0001, Dan Feng 0001, Xiaomin Zou, Renzhi Xiao. SecNVM: An Efficient and Write-Friendly Metadata Crash Consistency Scheme for Secure NVM
Aditya Ukarande, Suryakant Patidar, Ram Rangan. Locality-Aware CTA Scheduling for Gaming Applications
Chen Ding 0001, Dong Chen, Fangzhou Liu, Benjamin Reber, Wesley Smith. CARL: Compiler Assigned Reference Leasing
Bang Di, Daokun Hu, Zhen Xie, Jianhua Sun 0002, Hao Chen 0002, Jinkui Ren, Dong Li. TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling
Dennis Rieber, Axel Acosta 0001, Holger Fröning. Joint Program and Layout Transformations to Enable Convolutional Operators on Specialized Hardware Based on Constraint Programming
Daeyeal Lee, Bill Lin, Chung-Kuan Cheng. SMT-Based Contention-Free Task Mapping and Scheduling on 2D/3D SMART NoC with Mixed Dimension-Order Routing
Prasanth Chatarasi, Hyoukjun Kwon, Angshuman Parashar, Michael Pellauer, Tushar Krishna, Vivek Sarkar. Marvel: A Data-Centric Approach for Mapping Deep Learning Operators on Spatial Accelerators
Franyell Silfa, José María Arnau, Antonio González 0001. E-BATCH: Energy-Efficient and High-Throughput RNN Batching
Gururaj Saileshwar, Rick Boivie, Tong Chen, Benjamin Segal, Alper Buyuktosunoglu. HeapCheck: Low-cost Hardware Support for Memory Safety
Muhammad Aditya Sasongko, Milind Chabbi, Mandana Bagheri-Marzijarani, Didem Unat. ReuseTracker: Fast Yet Accurate Multicore Reuse Distance Analyzer
Muhammad Waqar Azhar, Miquel Pericàs, Per Stenström. Task-RM: A Resource Manager for Energy Reduction in Task-Parallel Applications under Quality of Service Constraints
Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David W. Nellans, Stephen W. Keckler. GPU Domain Specialization via Composable On-Package Architecture