- BOLT: A Practical Binary Optimizer for Data Centers and Beyond. Maksim Panchenko, Rafael Auler, Bill Nell, Guilherme Ottoni. 2-14
- Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation. Ruoyu Zhou, Timothy M. Jones. 15-25
- Smokestack: Thwarting DOP Attacks with Runtime Stack Layout Randomization. Misiker Tadesse Aga, Todd M. Austin. 26-36
- Automatic Equivalence Checking for Assembly Implementations of Cryptography Libraries. Jay P. Lim, Santosh Nagarakatte. 37-49
- CSOD: Context-Sensitive Overflow Detection. Hongyu Liu, Sam Silvestro, Xiaoyin Wang, Lide Duan, Tongping Liu. 50-60
- Reasoning about the Node.js Event Loop using Async Graphs. Haiyang Sun, Daniele Bonetta, Filippo Schiavio, Walter Binder. 61-72
- Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs. Simon Garcia De Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-mei Hwu. 73-84
- A Code Generator for High-Performance Tensor Contractions on GPUs. Jinsung Kim, Aravind Sukumaran-Rajam, Vineeth Thumma, Sriram Krishnamoorthy, Ajay Panyala, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan. 85-95
- Transforming Query Sequences for High-Throughput B+ Tree Processing on Many-Core Processors. Ruiqin Tian, Junqiao Qiu, Zhijia Zhao, Xu Liu, Bin Ren. 96-108
- Quantifying and Reducing Execution Variance in STM via Model Driven Commit Optimization. Girish Mururu, Ada Gavrilovska, Santosh Pande. 109-121
- White-Box Program Tuning. Wen-Chuan Lee, Yingqi Liu, Peng Liu, ShiQing Ma, Hongjun Choi, Xiangyu Zhang, Rajiv Gupta. 122-135
- Generation of In-Bounds Inputs for Arrays in Memory-Unsafe Languages. Marcus Rodrigues, Breno Guimaraes, Fernando Magno Quintao Pereira. 136-148
- Function Merging by Sequence Alignment. Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Hugh Leather. 149-163
- An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers. Aleksandar Prokopec, Gilles Duboscq, David Leopoldseder, Thomas Würthinger. 164-179
- Tensor Algebra Compilation with Workspaces. Fredrik Kjolstad, Peter Ahrens, Shoaib Kamil, Saman P. Amarasinghe. 180-192
- Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code. Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, Saman P. Amarasinghe. 193-205
- Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements. Vasileios Porpodas, Rodrigo C. O. Rocha, Evgueni Brevnov, Luís F. W. Góes, Timothy Mattson. 206-216
- Locus: A System and a Language for Program Optimization. Thiago S. F. X. Teixeira, Corinne Ancourt, David A. Padua, William Gropp. 217-228
- Decoding CUDA Binary. Ari B. Hayes, Fei Hua, Jin Huang, Yan Hao Chen, Eddy Z. Zhang. 229-241
- From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization. Bo Qiao, Oliver Reiche, Frank Hannig, Jürgen Teich. 242-253
- IGC: The Open Source Intel Graphics Compiler. Anupama Chandrasekhar, Gang Chen, Po-Yu Chen, Wei-Yu Chen, Junjie Gu, Peng Guo, Shruthi Hebbur Prasanna Kumar, Guei-Yuan Lueh, Pankaj Mistry, Wei Pan, Thomas Raoux, Konrad Trifunovic. 254-265
- Automatic Parallelization of Irregular x86-64 Loops. Brandon Neth, Michelle Mills Strout. 266
- A Shared BTB Design for Multicore Systems. Moumita Das, Ansuman Banerjee, Bhaskar Sardar. 267-268
- Optimizing RNA-RNA Interaction Computations. Swetha Varadarajan. 269-270
- Code Generation from Formal Models for Automatic RTOS Portability. Renata Martins Gomes, Marcel Baunach. 271-272
- Understanding RDMA Behavior in NUMA Systems. Jacob Nelson, Roberto Palmieri. 273-274
- Translating Traditional SIMD Instructions to Vector Length Agnostic Architectures. Sheng-Yu Fu, Wei-Chung Hsu. 275
- Accelerating GPU Computing at Runtime with Binary Optimization. Guangli Li, Lei Liu, Xiaobing Feng. 276-277
- Extending LLVM for Lightweight SPMD Vectorization: Using SIMD and Vector Instructions Easily from Any Language. Robin Kruppe, Julian Oppermann, Lukas Sommer, Andreas Koch. 278-279
- Multi-target Compiler for the Deployment of Machine Learning Models. Oscar Castro-Lopez, Inés Fernando Vega López. 280-281
- A Tool for Performance Analysis of GPU-Accelerated Applications. Keren Zhou, John M. Mellor-Crummey. 282
- Kernel Fusion/Decomposition for Automatic GPU-Offloading. Alok Mishra, Martin Kong, Barbara M. Chapman. 283-284
- Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation. Yonghae Kim, Hyesoon Kim. 285-286