- Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers. Alexander Heinecke, Alexander Breuer, Sebastian Rettenberger, Michael Bader, Alice-Agnes Gabriel, Christian Pelties, Arndt Bode, William Barth, Xiang-Ke Liao, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy, Pradeep Dubey. 3-14
- Physics-Based Urban Earthquake Simulation Enhanced by 10.7 BlnDOF × 30 K Time-Step Unstructured FE Non-Linear Seismic Wave Simulation. Tsuyoshi Ichimura, Kohei Fujita, Seizo Tanaka, Muneo Hori, Maddegedara Lalith Lakshman, Yoshihisa Shizawa, Hiroshi Kobayashi. 15-26
- Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution. Andrew S. Cassidy, Rodrigo Alvarez-Icaza, Filipp Akopyan, Jun Sawada, John V. Arthur, Paul Merolla, Pallab Datta, Marc González Tallada, Brian Taba, Alexander Andreopoulos, Arnon Amir, Steven K. Esser, Jeff Kusnitz, Rathinakumar Appuswamy, Chuck Haymes, Bernard Brezzo, Roger Moussalli, Ralph Bellofatto, Christian W. Baks, Michael Mastro, Kai Schleupen, Charles E. Cox, Ken Inoue, Steve Millman, Nabil Imam, Emmett McQuinn, Yutaka Y. Nakamura, Ivan Vo, Chen Guo, Don Nguyen, Scott Lekuch, Sameh W. Asaad, Daniel J. Friedman, Bryan L. Jackson, Myron Flickner, William P. Risk, Rajit Manohar, Dharmendra S. Modha. 27-38
- Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer. David E. Shaw, J. P. Grossman, Joseph A. Bank, Brannon Batson, J. Adam Butts, Jack C. Chao, Martin M. Deneroff, Ron O. Dror, Amos Even, Christopher H. Fenton, Anthony Forte, Joseph Gagliardo, Gennette Gill, Brian Greskamp, Richard C. Ho, Douglas J. Ierardi, Lev Iserovich, Jeffrey Kuskin, Richard H. Larson, Timothy Layman, Li-Siang Lee, Adam K. Lerer, Chester Li, Daniel Killebrew, Kenneth M. Mackenzie, Shark Yeuk-Hai Mok, Mark A. Moraes, Rolf Mueller, Lawrence J. Nociolo, Jon L. Peticolas, Terry Quan, Daniel Ramot, John K. Salmon, Daniele Paolo Scarpazza, U. Ben Schafer, Naseer Siddique, Christopher W. Snyder, Jochen Spengler, Ping Tak Peter Tang, Michael Theobald, Horia Toma, Brian Towles, Benjamin Vitale, Stanley C. Wang, Cliff Young. 41-53
- 24.77 Pflops on a Gravitational Tree-Code to Simulate the Milky Way Galaxy with 18600 GPUs. Jeroen Bédorf, Evghenii Gaburov, Michiko S. Fujii, Keigo Nitadori, Tomoaki Ishiyama, Simon Portegies Zwart. 54-65
- Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors. Simon Heybrock, Bálint Joó, Dhiraj D. Kalamkar, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Tilo Wettig, Pradeep Dubey. 69-80
- Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation. James C. Phillips, Yanhua Sun, Nikhil Jain, Eric J. Bohm, Laxmikant V. Kalé. 81-91
- A Volume Integral Equation Stokes Solver for Problems with Variable Coefficients. Dhairya Malhotra, Amir Gholami, George Biros. 92-102
- Fence Scoping. Changhui Lin, Vijay Nagarajan, Rajiv Gupta. 105-116
- Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy. Ralph Nathan, Bryan Anthonio, Shih-Lien Lu, Helia Naeimi, Daniel J. Sorin, Xiaobai Sun. 117-127
- Managing DRAM Latency Divergence in Irregular GPGPU Applications. Niladrish Chatterjee, Mike O'Connor, Gabriel H. Loh, Nuwan Jayasena, Rajeev Balasubramonian. 128-139
- CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression. Jidong Zhai, Jianfei Hu, Xiongchao Tang, Xiaosong Ma, Wenguang Chen. 143-153
- The Lightweight Distributed Metric Service: A Scalable Infrastructure for Continuous Monitoring of Large Scale Computing Systems and Applications. Anthony Agelastos, Benjamin Allan, Jim M. Brandt, Paul Cassella, Jeremy Enos, Joshi Fullop, Ann C. Gentile, Steve Monk, Nichamon Naksinehaboon, Jeff Ogden, Mahesh Rajan, Michael T. Showerman, Joel Stevenson, Narate Taerat, Tom Tucker. 154-165
- Dissecting On-Node Memory Access Performance: A Semantic Approach. Alfredo Giménez, Todd Gamblin, Barry Rountree, Abhinav Bhatele, Ilir Jusufi, Peer-Timo Bremer, Bernd Hamann. 166-176
- Practical Symbolic Race Checking of GPU Programs. Peng Li, Guodong Li, Ganesh Gopalakrishnan. 179-190
- Scalable Kernel Fusion for Memory-Bound GPU Applications. Mohamed Wahib, Naoya Maruyama. 191-202
- A Unified Programming Model for Intra- and Inter-Node Offloading on Xeon Phi Clusters. Matthias Noack, Florian Wende, Thomas Steinke, Frank Cordes. 203-214
- Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems. Sarp Oral, James Simmons, Jason Hill, Dustin Leverman, Feiyi Wang, Matt Ezell, Ross Miller, Douglas Fuller, Raghul Gunasekaran, Youngjae Kim, Saurabh Gupta, Devesh Tiwari, Sudharshan S. Vazhkudai, James H. Rogers, David Dillow, Galen M. Shipman, Arthur S. Bland. 217-228
- A User-Friendly Approach for Tuning Parallel File Operations. Robert McLay, Doug James, Si Liu, John Cazes, William Barth. 229-236
- IndexFS: Scaling File System Metadata Performance with Stateless Caching and Bulk Insertion. Kai Ren, Qing Zheng, Swapnil Patil, Garth A. Gibson. 237-248
- High-Productivity Framework on GPU-Rich Supercomputers for Operational Weather Prediction Code ASUCA. Takashi Shimokawabe, Takayuki Aoki, Naoyuki Onodera. 251-261
- Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System. Ali Charara, Hatem Ltaief, Damien Gratadour, David E. Keyes, Arnaud Sevin, Ahmad Abdelfattah, Eric Gendron, Carine Morel, Fabrice Vidal. 262-273
- pTatin3D: High-Performance Methods for Long-Term Lithospheric Dynamics. Dave A. May, Jed Brown, Laetitia Le Pourhiet. 274-284
- Oil and Water Can Mix: An Integration of Polyhedral and AST-Based Transformations. Jun Shirako, Louis-Noël Pouchet, Vivek Sarkar. 287-298
- Compiler Techniques for Massively Scalable Implicit Task Parallelism. Timothy G. Armstrong, Justin M. Wozniak, Michael Wilde, Ian T. Foster. 299-310
- MSL: A Synthesis Enabled Language for Distributed Implementations. Zhilei Xu, Shoaib Kamil, Armando Solar-Lezama. 311-322
- RAHTM: Routing Algorithm Aware Hierarchical Task Mapping. Ahmed H. Abdel-Gawad, Mithuna Thottethodi, Abhinav Bhatele. 325-335
- Maximizing Throughput on a Dragonfly Network. Nikhil Jain, Abhinav Bhatele, Xiang Ni, Nicholas J. Wright, Laxmikant V. Kalé. 336-347
- Slim Fly: A Cost Effective Low-Diameter Network Topology. Maciej Besta, Torsten Hoefler. 348-359
- A Computation- and Communication-Optimal Parallel Direct 3-Body Algorithm. Penporn Koanantakool, Katherine A. Yelick. 363-374
- A Communication-Optimal Framework for Contracting Distributed Tensors. Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, P. Sadayappan. 375-386
- Fast Parallel Computation of Longest Common Prefixes. Julian Shun. 387-398
- Fast Iterative Graph Computation: A Path Centric Approach. Pingpeng Yuan, Wenya Zhang, Changfeng Xie, Hai Jin, Ling Liu, Kisung Lee. 401-412
- Efficient I/O and Storage of Adaptive-Resolution Data. Sidharth Kumar, John Edwards, Peer-Timo Bremer, Aaron Knoll, Cameron Christensen, Venkatram Vishwanath, Philip H. Carns, John A. Schmidt, Valerio Pascucci. 413-423
- An Image-Based Approach to Extreme Scale in Situ Visualization and Analysis. James J. Ahrens, Sébastien Jourdain, Patrick O'Leary, John Patchett, David H. Rogers, Mark Petersen. 424-434
- Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly. Evangelos Georganas, Aydin Buluç, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine A. Yelick. 437-448
- Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization. Kanak Mahadik, Somali Chaterji, Bowen Zhou, Milind Kulkarni, Saurabh Bagchi. 449-460
- Parallel Bayesian Network Structure Learning for Genome-Scale Gene Networks. Sanchit Misra, Md. Vasimuddin, Kiran Pamnany, Sriram P. Chockalingam, Yong Dong, Min Xie, Maneesha R. Aluru, Srinivas Aluru. 461-472
- Nonblocking Epochs in MPI One-Sided Communication. Judicael A. Zounmevo, Xin Zhao, Pavan Balaji, William Gropp, Ahmad Afsahi. 475-486
- Enabling Efficient Multithreaded MPI Communication through a Library-Based Implementation of MPI Endpoints. Srinivas Sridharan, James Dinan, Dhiraj D. Kalamkar. 487-498
- MC-Checker: Detecting Memory Consistency Errors in MPI One-Sided Applications. Zhezhe Chen, James Dinan, Zhen Tang, Pavan Balaji, Hua Zhong, Jun Wei, Tao Huang, Feng Qin. 499-510
- Scheduling Multi-tenant Cloud Workloads on Accelerator-Based Systems. Dipanjan Sengupta, Anshuman Goswami, Karsten Schwan, Krishna Pallavi. 513-524
- Scaling MapReduce Vertically and Horizontally. Ismail El-Helw, Rutger F. H. Hofman, Henri E. Bal. 525-535
- The DRIHM Project: A Flexible Approach to Integrate HPC, Grid and Cloud Resources for Hydro-Meteorological Research. Daniele D'Agostino, Andrea Clematis, Antonella Galizia, Alfonso Quarati, Emanuele Danovaro, Luca Roverelli, Gabriele Zereik, Dieter Kranzlmüller, Michael Schiffers, Nils gentschen Felde, Christian Straube, Olivier Caumont, Evelyne Richard, Luis Garrote, Quillon Harpham, H. R. A. Jagers, Vladimir Dimitrijevic, Ljiljana Dekic, Elisabetta Fiori, Fabio Delogu, Antonio Parodi. 536-546
- Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates. Roger A. Pearce, Maya Gokhale, Nancy M. Amato. 549-559
- Pardicle: Parallel Approximate Density-Based Clustering. Md. Mostofa Ali Patwary, Nadathur Satish, Narayanan Sundaram, Fredrik Manne, Salman Habib, Pradeep Dubey. 560-571
- Scalable and High Performance Betweenness Centrality on the GPU. Adam McLaughlin, David A. Bader. 572-583
- Understanding Soft Error Resiliency of Blue Gene/Q Compute Chip through Hardware Proton Irradiation and Software Fault Injection. Chen-Yong Cher, Meeta Sharma Gupta, Pradip Bose, K. Paul Muller. 587-596
- Fail-in-Place Network Design: Interaction Between Topology, Routing Algorithm and Failures. Jens Domke, Torsten Hoefler, Satoshi Matsuoka. 597-608
- Correctness Field Testing of Production and Decommissioned High Performance Computing Platforms at Los Alamos National Laboratory. Sarah E. Michalak, William N. Rust, John T. Daly, Andrew J. DuBois, David H. DuBois. 609-619
- Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction. Matthieu Dorier, Shadi Ibrahim, Gabriel Antoniu, Robert B. Ross. 623-634
- Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, Robert B. Ross. 635-646
- Parallel Programming with Migratable Objects: Charm++ in Practice. Bilge Acun, Abhishek Gupta, Nikhil Jain, Akhil Langer, Harshitha Menon, Eric Mikida, Xiang Ni, Michael P. Robson, Yanhua Sun, Ehsan Totoni, Lukasz Wesolowski, Laxmikant V. Kalé. 647-658
- Metascalable Quantum Molecular Dynamics Simulations of Hydrogen-on-Demand. Ken-ichi Nomura, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta, Kohei Shimamura, Fuyuki Shimojo, Manaschai Kunaseth, Paul C. Messina, Nichols A. Romero. 661-673
- Efficient Implementation of Many-Body Quantum Chemical Methods on the Intel® Xeon Phi Coprocessor. Edoardo Aprà, Michael Klemm, Karol Kowalski. 674-684
- Optimized Scheduling Strategies for Hybrid Density Functional Theory Electronic Structure Calculations. William Dawson, François Gygi. 685-692
- Quantitatively Modeling Application Resilience with the Data Vulnerability Factor. Li Yu, Dong Li, Sparsh Mittal, Jeffrey S. Vetter. 695-706
- A System Software Approach to Proactive Memory-Error Avoidance. Carlos H. A. Costa, Yoonho Park, Bryan S. Rosenburg, Chen-Yong Cher, Kyung Dong Ryu. 707-718
- Fault-Tolerant Dynamic Task Graph Scheduling. Mehmet Can Kurt, Sriram Krishnamoorthy, Kunal Agrawal, Gagan Agrawal. 719-730
- NUMARCK: Machine Learning Algorithm for Resiliency and Checkpointing. Zhengzhang Chen, Seung Woo Son, William Hendrix, Ankit Agrawal, Wei-keng Liao, Alok N. Choudhary. 733-744
- Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra Chaudhari, Brian Kingsbury. 745-753
- FAST: Near Real-Time Searchable Data Analytics for the Cloud. Yu Hua, Hong Jiang, Dan Feng. 754-765
- Efficient Sparse Matrix-Vector Multiplication on GPUs Using the CSR Storage Format. Joseph L. Greathouse, Mayank Daga. 769-780
- Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications. Arash Ashari, Naser Sedaghati, John Eisenlohr, Srinivasan Parthasarathy, P. Sadayappan. 781-792
- A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers. Catherine Mills Olschanowsky, Michelle Mills Strout, Stephen Guzik, John Loffeld, Jeffrey Hittinger. 793-804
- Maximizing Throughput of Overprovisioned HPC Data Centers Under a Strict Power Budget. Osman Sarood, Akhil Langer, Abhishek Gupta, Laxmikant V. Kalé. 807-818
- Application Centric Energy-Efficiency Study of Distributed Multi-Core and Hybrid CPU-GPU Systems. Ben Cumming, Gilles Fourestey, Oliver Fuhrer, Tobias Gysi, Massimiliano Fatica, Thomas C. Schulthess. 819-829
- Scaling the Power Wall: A Path to Exascale. Oreste Villa, Daniel R. Johnson, Mike O'Connor, Evgeny Bolotin, David W. Nellans, Justin Luitjens, Nikolai Sakharnykh, Peng Wang, Paulius Micikevicius, Anthony Scudiero, Stephen W. Keckler, William J. Dally. 830-841
- Structure Slicing: Extending Logical Regions with Fields. Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken. 845-856
- Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing. Jonathan Lifflander, Sriram Krishnamoorthy, Laxmikant V. Kalé. 857-868
- DISC: A Domain-Interaction Based Programming Model with Support for Heterogeneous Execution. Mehmet Can Kurt, Gagan Agrawal. 869-880
- Understanding the Effects of Communication and Coordination on Checkpointing at Scale. Kurt B. Ferreira, Patrick Widener, Scott Levy, Dorian C. Arnold, Torsten Hoefler. 883-894
- Exploring Automatic, Online Failure Recovery for Scientific Applications at Extreme Scales. Marc Gamell, Daniel S. Katz, Hemanth Kolla, Jacqueline Chen, Scott Klasky, Manish Parashar. 895-906
- Optimization of a Multilevel Checkpoint Model with Uncertain Execution Scales. Sheng Di, Leonardo Bautista-Gomez, Franck Cappello. 907-918
- Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction. Konstantinos I. Karantasis, Andrew Lenharth, Donald Nguyen, María Jesús Garzarán, Keshav Pingali. 921-932
- Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster. Ichitaro Yamazaki, Sivasankaran Rajamanickam, Erik G. Boman, Mark Hoemmen, Michael A. Heroux, Stanimire Tomov. 933-944
- Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. JongSoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey. 945-955
- FlexSlot: Moving Hadoop Into the Cloud with Flexible Slot Management. Yanfei Guo, Jia Rao, Changjun Jiang, Xiaobo Zhou. 959-969
- Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds. Haikun Liu, Bingsheng He. 970-981
- Finding Constant from Change: Revisiting Network Performance Aware Optimizations on IaaS Clouds. Yifan Gong, Bingsheng He, Dan Li. 982-993
- High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation. Tom Peterka, Dmitriy Morozov, Carolyn L. Phillips. 997-1007
- Scalable Computation of Stream Surfaces on Large Scale Vector Fields. Kewei Lu, Han-Wei Shen, Tom Peterka. 1008-1019
- In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees. Aaditya G. Landge, Valerio Pascucci, Attila Gyulassy, Janine Bennett, Hemanth Kolla, Jacqueline Chen, Peer-Timo Bremer. 1020-1031
- ECC Parity: A Technique for Efficient Memory Error Resilience for Multi-Channel Memory Systems. Xun Jian, Rakesh Kumar. 1035-1046
- Using an Adaptive HPC Runtime System to Reconfigure the Cache Hierarchy. Ehsan Totoni, Josep Torrellas, Laxmikant V. Kalé. 1047-1058
- Microbank: Architecting Through-Silicon Interposer-Based Main Memory Systems. Young Hoon Son, O. Seongil, Hyunggyun Yang, Daejin Jung, Jung Ho Ahn, John Kim, Jangwoo Kim, Jae W. Lee. 1059-1070