- PFault: A General Framework for Analyzing the Reliability of High-Performance Parallel File Systems. Jinrui Cao, Om Rameshwar Gatla, Mai Zheng, Dong Dai, Vidya Eswarappa, Yan Mu, Yong Chen 0001. 1-11
- Rethinking Node Allocation Strategy for Data-intensive Applications in Consideration of Spatially Bursty I/O. Jie Yu, Guangming Liu, Xin Liu, Wenrui Dong, Xiaoyong Li, Yusheng Liu. 12-21
- PA-SSD: A Page-Type Aware TLC SSD for Improved Write/Read Performance and Storage Efficiency. Wenhui Zhang, Qiang Cao, Hong Jiang, Jie Yao. 22-32
- IRIS: I/O Redirection via Integrated Storage. Anthony Kougkas, Hariharan Devarajan, Xian-He Sun. 33-42
- GRU: Exploring Computation and Data Redundancy via Partial GPU Computing Result Reuse. Husheng Zhou, Soroush Bateni, Cong Liu 0005. 43-52
- Warp-Consolidation: A Novel Execution Model for GPUs. Ang Li, Weifeng Liu 0002, Linnan Wang, Kevin J. Barker, Shuaiwen Leon Song. 53-64
- Classification-Driven Search for Effective SM Partitioning in Multitasking GPUs. Xia Zhao, Zhiying Wang, Lieven Eeckhout. 65-75
- The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU. Bernhard Kerbl, Michael Kenzel, Joerg H. Mueller, Dieter Schmalstieg, Markus Steinberger. 76-85
- Analysis-driven Engineering of Comparison-based Sorting Algorithms on GPUs. Ben Karsin, Volker Weichert, Henri Casanova, John Iacono, Nodari Sitchinava. 86-95
- Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs. Jinsung Kim, Aravind Sukumaran-Rajam, Changwan Hong, Ajay Panyala, Rohit Kumar Srivastava, Sriram Krishnamoorthy, P. Sadayappan. 96-106
- A two-phase recovery mechanism. Zhaoxiang Jin, Soner Önder. 107-117
- HALO: A Hierarchical Memory Access Locality Modeling Technique For Memory System Explorations. Reena Panda, Lizy K. John. 118-128
- High-Performance, Low-Complexity Deadlock Avoidance for Arbitrary Topologies/Routings. Jose Antonio Pascual, Javier Navaridas. 129-138
- ComPEND: Computation Pruning through Early Negative Detection for ReLU in a Deep Neural Network Accelerator. Dongwoo Lee, Sungbum Kang, Kiyoung Choi. 139-148
- CELIA: A Device and Architecture Co-Design Framework for STT-MRAM-Based Deep Learning Acceleration. Hao Yan, Hebin R. Cherian, Ethan C. Ahn, Lide Duan. 149-159
- Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs. Jacob Lambert, Seyong Lee, Jungwon Kim, Jeffrey S. Vetter, Allen D. Malony. 160-171
- ReGraph: A Graph Processing Framework that Alternately Shrinks and Repartitions the Graph. Xue Li, Mingxing Zhang, Kang Chen, Yongwei Wu. 172-183
- cuMBIR: An Efficient Framework for Low-dose X-ray CT Image Reconstruction on GPUs. Xiuhong Li, Yun Liang 0001, Wentai Zhang, Taide Liu, Haochen Li, Guojie Luo, Ming Jiang 0001. 184-194
- Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data. Feng Zhang 0007, Jidong Zhai, Xipeng Shen, Onur Mutlu, Wenguang Chen. 195-206
- Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies. Isaac Sánchez Barrera, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero, Marc Casas. 207-217
- Runtime-Guided Management of Stacked DRAM Memories in Task Parallel Programs. Lluc Alvarez, Marc Casas, Jesús Labarta, Eduard Ayguadé, Mateo Valero, Miquel Moretó. 218-228
- Optimizing Data Aggregation by Leveraging the Deep Memory Hierarchy on Large-scale Systems. François Tessier, Paul Gressier, Venkatram Vishwanath. 229-239
- Automated Analysis of Time Series Data to Understand Parallel Program Behaviors. Lai Wei, John M. Mellor-Crummey. 240-251
- ChplBlamer: A Data-centric and Code-centric Combined Profiler for Multi-locale Chapel Programs. Hui Zhang, Jeffrey K. Hollingsworth. 252-262
- ProfDP: A Lightweight Profiler to Guide Data Placement in Heterogeneous Memory Systems. Shasha Wen, Lucy Cherkasova, Felix Xiaozhu Lin, Xu Liu 0001. 263-273
- Phase-Aware Web Browser Power Management on HMP Platforms. Nadja Peters, Sangyoung Park, Daniel Clifford, S. Kyostila, R. McIlroy, Benedikt Meurer, Hannes Payer, Samarjit Chakraborty. 274-283
- Demystifying Cache Policies for Photo Stores at Scale: A Tencent Case Study. Ke Zhou 0001, Si Sun, Hua Wang, Ping Huang, Xubin He, Rui Lan, Wenyan Li, Wenjie Liu 0002, Tianming Yang. 284-294
- Isometry: A Path-Based Distributed Data Transfer System. Zhihao Jia, Sean Treichler, Galen M. Shipman, Patrick S. McCormick, Alex Aiken. 295-306
- Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems. Yang You, James Demmel, Cho-Jui Hsieh, Richard W. Vuduc. 307-317
- Dynamic Load Balancing for Compressible Multiphase Turbulence. Keke Zhai, Tania Banerjee, David Zwick, Jason Hackl, Sanjay Ranka. 318-327
- Revisiting Loop Tiling for Datacenters: Live and Let Live. Jiacheng Zhao, Huimin Cui, Yalin Zhang, Jingling Xue, Xiaobing Feng 0002. 328-340
- Sculptor: Flexible Approximation with Selective Dynamic Loop Perforation. Shikai Li, Sunghyun Park, Scott A. Mahlke. 341-351
- A Case for Granularity Aware Page Migration. Jee Ho Ryoo, Lizy K. John, Arkaprava Basu. 352-362
- Towards Efficient SpMV on Sunway Manycore Architectures. Changxi Liu, Biwei Xie, Xin Liu, Wei Xue, Hailong Yang, Xu Liu 0001. 363-373
- On Optimizing Distributed Tucker Decomposition for Sparse Tensors. Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Prakash Murali, Shivmaran S. Pandian, Yogish Sabharwal, Dheeraj Sreedhar. 374-384
- Bootstrapping Parameter Space Exploration for Fast Tuning. Jayaraman J. Thiagarajan, Nikhil Jain, Rushil Anirudh, Alfredo Giménez, Rahul Sridhar, Aniruddha Marathe, Tao Wang, Murali Emani, Abhinav Bhatele, Todd Gamblin. 385-395