- Polly-ACC: Transparent compilation to heterogeneous hardware. Tobias Grosser, Torsten Hoefler. 1
- Hybrid CPU-GPU scheduling and execution of tree traversals. Jianqiao Liu, Nikhil Hegde, Milind Kulkarni. 2
- Exploiting Dynamic Reuse Probability to Manage Shared Last-level Caches in CPU-GPU Heterogeneous Processors. Siddharth Rai, Mainak Chaudhuri. 3
- AEQUITAS: Coordinated Energy Management Across Parallel Applications. Haris Ribic, Yu David Liu. 4
- Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes. Dimitrios Chasapis, Marc Casas, Miquel Moretó, Martin Schulz, Eduard Ayguadé, Jesús Labarta, Mateo Valero. 5
- Variation Among Processors Under Turbo Boost in HPC Systems. Bilge Acun, Phil Miller, Laxmikant V. Kalé. 6
- Mini-Ckpts: Surviving OS Failures in Persistent Memory. David Fiala, Frank Mueller, Kurt B. Ferreira, Christian Engelmann. 7
- High Performance Design for HDFS with Byte-Addressability of NVM and RDMA. Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. Panda. 8
- Write-Aware Management of NVM-based Memory Extensions. Amro Awad, Sergey Blagodurov, Yan Solihin. 9
- HOPE: Enabling Efficient Service Orchestration in Software-Defined Data Centers. Yang Hu, Chao Li, Longjun Liu, Tao Li. 10
- Towards an Adaptive Multi-Power-Source Datacenter. Longjun Liu, Hongbin Sun, Chao Li, Yang Hu, Nanning Zheng, Tao Li. 11
- GreenGear: Leveraging and Managing Server Heterogeneity for Improving Energy Efficiency in Green Data Centers. Xu Zhou, Haoran Cai, Qiang Cao, Hong Jiang, Lei Tian, Changsheng Xie. 12
- Noise Aware Scheduling in Data Centers. Hameedah Sultan, Arpit Katiyar, Smruti R. Sarangi. 13
- Coherence-Free Multiview: Enabling Reference-Discerning Data Placement on GPU. Guoyang Chen, Xipeng Shen. 14
- SFU-Driven Transparent Approximation Acceleration on GPUs. Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar, Henk Corporaal. 15
- Reusing Data Reorganization for Efficient SIMD Parallelization of Adaptive Irregular Applications. Peng Jiang, Linchuan Chen, Gagan Agrawal. 16
- SReplay: Deterministic Sub-Group Replay for One-Sided Communication. Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu. 17
- Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication. Konstantina Mitropoulou, Vasileios Porpodas, Xiaochun Zhang, Timothy M. Jones. 18
- Efficient Timestamp-Based Cache Coherence Protocol for Many-Core Architectures. Yuan Yao, Guanhua Wang, Zhiguo Ge, Tulika Mitra, Wenzhi Chen, Naxin Zhang. 19
- BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing. Linnan Wang, Wei Wu, Zenglin Xu, Jianxiong Xiao, Yi Yang. 20
- Peruse and Profit: Estimating the Accelerability of Loops. Snehasish Kumar, Vijayalakshmi Srinivasan, Amirali Sharifian, Nick Sumner, Arrvindh Shriraman. 21
- Simulation and Analysis Engine for Scale-Out Workloads. Nadav Chachmon, Daniel Richins, Robert S. Cohn, Magnus Christensson, Wenzhi Cui, Vijay Janapa Reddi. 22
- Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks. Patrick Judd, Jorge Albericio, Tayler H. Hetherington, Tor M. Aamodt, Natalie D. Enright Jerger, Andreas Moshovos. 23
- Galaxyfly: A Novel Family of Flexible-Radix Low-Diameter Topologies for Large-Scales Interconnection Networks. Fei Lei, Dezun Dong, Xiangke Liao, Xing Su, Cunlu Li. 24
- Replichard: Towards Tradeoff between Consistency and Performance for Metadata. Zhiying Li, Ruini Xue, Lixiang Ao. 25
- TokenTLB: A Token-Based Page Classification Approach. Albert Esteve, Alberto Ros, Antonio Robles, María Engracia Gómez, José Duato. 26
- Exploiting Private Local Memories to Reduce the Opportunity Cost of Accelerator Integration. Emilio G. Cota, Paolo Mantovani, Luca P. Carloni. 27
- GCaR: Garbage Collection aware Cache Management with Improved Performance for Flash-based SSDs. Suzhen Wu, Yanping Lin, Bo Mao, Hong Jiang. 28
- Fairness-oriented OS Scheduling Support for Multicore Systems. Changdae Kim, Jaehyuk Huh. 29
- Scheduling Tasks with Mixed Timing Constraints in GPU-Powered Real-Time Systems. Yunlong Xu, Rui Wang, Tao Li, Mingcong Song, Lan Gao, Zhongzhi Luan, Depei Qian. 30
- CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs. Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, Rajiv Gupta. 31
- DSMR: A Parallel Algorithm for Single-Source Shortest Path Problem. Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Jesús Garzarán, David A. Padua, Keshav Pingali. 32
- Parallel Transposition of Sparse Data Structures. Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng. 33
- SARVAVID: A Domain Specific Language for Developing Scalable Computational Genomics Applications. Kanak Mahadik, Christopher Wright, Jinyi Zhang, Milind Kulkarni, Saurabh Bagchi, Somali Chaterji. 34
- Fast Multiplication in Binary Fields on GPUs via Register Cache. Eli Ben-Sasson, Matan Hamilis, Mark Silberstein, Eran Tromer. 35
- Balanced Hashing and Efficient GPU Sparse General Matrix-Matrix Multiplication. Pham Nguyen Quang Anh, Rui Fan, Yonggang Wen. 36
- Optimizing Sparse Matrix-Vector Multiplication for Large-Scale Data Analytics. Daniele Buono, Fabrizio Petrini, Fabio Checconi, Xing Liu, Xinyu Que, Chris Long, Tai-Ching Tuan. 37
- TurboTiling: Leveraging Prefetching to Boost Performance of Tiled Codes. Sanyam Mehta, Rajat Garg, Nishad Trivedi, Pen-Chung Yew. 38
- Graph Prefetching Using Data Structure Knowledge. Sam Ainsworth, Timothy M. Jones. 39
- Prefetching Techniques for Near-memory Throughput Processors. Reena Panda, Yasuko Eckert, Nuwan Jayasena, Onur Kayiran, Michael Boyer, Lizy Kurian John. 40
- Origami: Folding Warps for Energy Efficient GPUs. Mohammad Abdel-Majeed, Daniel Wong, Justin Kuang, Murali Annavaram. 41
- Barrier-Aware Warp Scheduling for Throughput Processors. Yuxi Liu, Zhibin Yu, Lieven Eeckhout, Vijay Janapa Reddi, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, Cheng-Zhong Xu. 42
- Tag-Split Cache for Efficient GPGPU Cache Utilization. Lingda Li, Ari B. Hayes, Shuaiwen Leon Song, Eddy Z. Zhang. 43