0 | -- | 0 | Arun Thangamani, V. Krishna Nandivada. Optimizing Remote Communication in X10 |
0 | -- | 0 | Ian Briggs, Arnab Das, Mark Baranowski, Vishal Chandra Sharma, Sriram Krishnamoorthy, Zvonimir Rakamaric, Ganesh Gopalakrishnan. FailAmp: Relativization Transformation for Soft Error Detection in Structured Address Generation |
0 | -- | 0 | Daniel Gerzhoy, Xiaowu Sun, Michael Zuzak, Donald Yeung. Nested MIMD-SIMD Parallelization for Heterogeneous Microprocessors |
0 | -- | 0 | Wenbin Jiang, Yang Ma, Bo Liu, Haikun Liu, Bing Bing Zhou, Jian Zhu, Song Wu 0001, Hai Jin 0001. Layup: Layer-adaptive and Multi-type Intermediate-oriented Memory Optimization for GPU-based CNNs |
0 | -- | 0 | Mostafa Koraei, Omid Fatemi, Magnus Jahre. DCMI: A Scalable Strategy for Accelerating Iterative Stencil Loops on FPGAs |
0 | -- | 0 | Salonik Resch, S. Karen Khatamifard, Zamshed Iqbal Chowdhury, Masoud Zabihi, Zhengyang Zhao, Jianping Wang 0006, Sachin S. Sapatnekar, Ulya R. Karpuzcu. PIMBALL: Binary Neural Networks in Spintronic Memory |
0 | -- | 0 | Jie Zhao 0002, Albert Cohen 0001. Flextended Tiles: A Flexible Extension of Overlapped Tiles for Polyhedral Compilation |
0 | -- | 0 | Reem Elkhouly, Mohammad A. Alshboul, Akihiro Hayashi, Yan Solihin, Keiji Kimura. Compiler-support for Critical Data Persistence in NVM |
0 | -- | 0 | Ahmad Yasin, Jawad Haj-Yahya, Yosi Ben-Asher, Avi Mendelson. A Metric-Guided Method for Discovering Impactful Features and Architectural Insights for Skylake-Based Processors |
0 | -- | 0 | Manuel Selva, Fabian Gruber, Diogo Sampaio, Christophe Guillon, Louis-Noël Pouchet, Fabrice Rastello. Building a Polyhedral Representation from an Instrumented Execution: Making Dynamic Analyses of Nonaffine Programs Scalable |
0 | -- | 0 | Leeor Peled, Uri C. Weiser, Yoav Etsion. A Neural Network Prefetcher for Arbitrary Memory Access Patterns |
0 | -- | 0 | Aristeidis Mastoras, Thomas R. Gross. Chunking for Dynamic Linear Pipelines |
0 | -- | 0 | Asif Ali Khan, Fazal Hameed, Robin Bläsing, Stuart S. P. Parkin, Jerónimo Castrillón. ShiftsReduce: Minimizing Shifts in Racetrack Memory 4.0 |
0 | -- | 0 | Kyle Daruwalla, Heng Zhuo, Rohit Shukla, Mikko H. Lipasti. BitSAD v2: Compiler Optimization and Analysis for Bitstream Computing |
0 | -- | 0 | Zhen Hang Jiang, Yunsi Fei, David R. Kaeli. Exploiting Bank Conflict-based Side-channel Timing Leakage of GPUs |
0 | -- | 0 | Michiel A. van der Vlag, Georgios Smaragdos, Zaid Al-Ars, Christos Strydis. Exploring Complex Brain-Simulation Workloads on Multi-GPU Deployments |
0 | -- | 0 | Sriseshan Srikanth, Anirudh Jain, Joseph M. Lennon, Thomas M. Conte, Erik DeBenedictis, Jeanine E. Cook. MetaStrider: Architectures for Scalable Memory-centric Reduction of Sparse Data Streams |
0 | -- | 0 | Lorenzo Chelini, Oleksandr Zinenko, Tobias Grosser, Henk Corporaal. Declarative Loop Tactics for Domain-specific Optimization |
0 | -- | 0 | Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, Albert Cohen 0001. The Next 700 Accelerated Layers: From Mathematical Expressions of Network Computation Graphs to Accelerated GPU Kernels, Automatically |
0 | -- | 0 | Khalid Ahmad, Hari Sundar, Mary W. Hall. Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs |
0 | -- | 0 | Sergi Siso, Wes Armour, Jeyarajan Thiyagalingam. Evaluating Auto-Vectorizing Compilers through Objective Withdrawal of Useful Information |
0 | -- | 0 | Larisa Stoltzfus, Bastian Hagedorn, Michel Steuwer, Sergei Gorlatch, Christophe Dubach. Tiling Optimizations for Stencil Computations Using Rewrite Rules in Lift |
0 | -- | 0 | Chunwei Xia, Jiacheng Zhao, Huimin Cui, Xiaobing Feng 0002, Jingling Xue. DNNTune: Automatic Benchmarking DNN Models for Mobile-cloud Computing |