- Massively parallel models of the human circulatory system. Amanda Randles, Erik W. Draeger, Tomas Oppelstrup, Liam Krauss, John A. Gunnels. 1
- The in-silico lab-on-a-chip: petascale and high-throughput simulations of microfluidics at cell resolution. Diego Rossinelli, Yu-Hang Tang, Kirill Lykov, Dmitry Alexeev, Massimo Bernaschi, Panagiotis E. Hadjidoukas, Mauro Bisson, Wayne Joubert, Christian Conti, George E. Karniadakis, Massimiliano Fatica, Igor Pivkin, Petros Koumoutsakos. 2
- Ab-initio quantum transport simulations on hybrid supercomputers. Mauro Calderara, Sascha Brück, Andreas Pedersen, Mohammad H. Bani-Hashemian, Joost VandeVondele, Mathieu Luisier. 3
- Implicit nonlinear wave simulation with 1.08T DOF and 0.270T unstructured finite elements to enhance comprehensive earthquake simulation. Tsuyoshi Ichimura, Kohei Fujita, Pher Errol Balde Quinay, Lalith Maddegedara, Muneo Hori, Seizo Tanaka, Yoshihisa Shizawa, Hiroshi Kobayashi, Kazuo Minami. 4
- An extreme-scale implicit solver for complex PDEs: highly heterogeneous flow in earth's mantle. Johann Rudi, A. Cristiano I. Malossi, Tobin Isaac, Georg Stadler, Michael Gurnis, Peter W. J. Staar, Yves Ineichen, Costas Bekas, Alessandro Curioni, Omar Ghattas. 5
- BD-CATS: big data clustering at trillion particle scale. Md. Mostofa Ali Patwary, Surendra Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Prabhat, Pradeep Dubey. 6
- Performance optimization for the k-nearest neighbors kernel on x86 architectures. Chenhan D. Yu, Jianyu Huang, Woody Austin, Bo Xiao, George Biros. 7
- Massively parallel phase-field simulations for ternary eutectic directional solidification. Martin Bauer, Johannes Hötzer, Marcus Jainta, Philipp Steinmetz, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde. 8
- Parallel implementation and performance optimization of the configuration-interaction method. Hongzhang Shan, Samuel Williams, Calvin W. Johnson, Kenneth McElvain, W. Erich Ormand. 9
- Efficient implementation of quantum materials simulations on distributed CPU-GPU systems. Raffaele Solcà, Anton Kozhevnikov, Azzam Haidar, Stanimire Tomov, Jack Dongarra, Thomas C. Schulthess. 10
- Runtime-driven shared last-level cache management for task-parallel programs. Abhisek Pan, Vijay S. Pai. 11
- Frugal ECC: efficient and versatile memory error protection through fine-grained compression. Jungrae Kim, Michael Sullivan, Seong-Lyong Gong, Mattan Erez. 12
- Automatic sharing classification and timely push for cache-coherent systems. Malek Musleh, Vijay S. Pai. 13
- HipMer: an extreme-scale de novo genome assembler. Evangelos Georganas, Aydin Buluç, Jarrod Chapman, Steven Hofmeyr, Chaitanya Aluru, Rob Egan, Leonid Oliker, Daniel Rokhsar, Katherine A. Yelick. 14
- A parallel connectivity algorithm for de Bruijn graphs in metagenomic applications. Patrick Flick, Chirag Jain, Tony Pan, Srinivas Aluru. 15
- Parallel distributed memory construction of suffix and longest common prefix arrays. Patrick Flick, Srinivas Aluru. 16
- Adaptive and transparent cache bypassing for GPUs. Ang Li, Gert-Jan van den Braak, Akash Kumar, Henk Corporaal. 17
- ELF: maximizing memory-level parallelism for GPUs with coordinated warp and fetch scheduling. Jason Jong Kyu Park, Yongjun Park, Scott A. Mahlke. 18
- Memory access patterns: the missing piece of the multi-GPU puzzle. Tal Ben-Nun, Ely Levy, Amnon Barak, Eri Rubin. 19
- AnalyzeThis: an analysis workflow-aware storage system. Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Devesh Tiwari, Ali Anwar, Ali Raza Butt, Lavanya Ramakrishnan. 20
- Mantle: a programmable metadata load balancer for the Ceph file system. Michael A. Sevilla, Noah Watkins, Carlos Maltzahn, Ike Nassi, Scott A. Brandt, Sage A. Weil, Greg Farnum, Sam Fineberg. 21
- HydraDB: a resilient RDMA-driven key-value middleware for in-memory cluster computing. Yandong Wang, Li Zhang, Jian Tan, Min Li, Yuqing Gao, Xavier Guerin, Xiaoqiao Meng, Shicong Meng. 22
- Full correlation matrix analysis of fMRI data on Intel® Xeon Phi™ coprocessors. Yida Wang, Michael J. Anderson, Jonathan D. Cohen, Alexander Heinecke, Kai Li, Nadathur Satish, Narayanan Sundaram, Nicholas B. Turk-Browne, Theodore L. Willke. 23
- A kernel-independent FMM in general dimensions. William B. March, Bo Xiao, Sameer Tharakan, Chenhan D. Yu, George Biros. 24
- Engineering inhibitory proteins with InSiPS: the in-silico protein synthesizer. Andrew Schoenrock, Daniel Burnside, Houman Moteshareie, Alex Wong, Ashkan Golshani, Frank Dehne. 25
- Exploring network optimizations for large-scale graph analytics. Xinyu Que, Fabio Checconi, Fabrizio Petrini, Xing Liu, Daniele Buono. 26
- GossipMap: a distributed community detection algorithm for billion-edge directed graphs. Seung-Hee Bae, Bill Howe. 27
- GraphReduce: processing large-scale graphs on accelerator-based systems. Dipanjan Sengupta, Shuaiwen Leon Song, Kapil Agarwal, Karsten Schwan. 28
- A case for application-oblivious energy-efficient MPI runtime. Akshay Venkatesh, Abhinav Vishnu, Khaled Hamidouche, Nathan R. Tallent, Dhabaleswar K. Panda, Darren J. Kerbyson, Adolfy Hoisie. 29
- Improving concurrency and asynchrony in multithreaded MPI applications using software offloading. Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Kiran Pamnany, Jeff R. Hammond, Pavan Balaji, Dipankar Das, JongSoo Park, Bálint Joó. 30
- Practical scalable consensus for pseudo-synchronous distributed systems. Thomas Hérault, Aurelien Bouteiller, George Bosilca, Marc Gamell, Keita Teranishi, Manish Parashar, Jack Dongarra. 31
- Monetary cost optimizations for MPI-based HPC applications on Amazon clouds: checkpoints and replicated execution. Yifan Gong, Bingsheng He, Amelie Chi Zhou. 32
- Elastic job bundling: an adaptive resource request strategy for large-scale parallel applications. Feng Liu, Jon B. Weissman. 33
- Fault tolerant MapReduce-MPI for HPC clusters. Yanfei Guo, Wesley Bland, Pavan Balaji, Xiaobo Zhou. 34
- Network endpoint congestion control for fine-grained communication. Nan Jiang, Larry R. Dennison, William J. Dally. 35
- Cost-effective diameter-two topologies: analysis and evaluation. Georgios Kathareios, Cyriel Minkenberg, Bogdan Prisacari, Germán Rodríguez, Torsten Hoefler. 36
- Profile-based power shifting in interconnection networks with on/off links. Shinobu Miwa, Hiroshi Nakamura. 37
- Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge leadership computing facility. Devesh Tiwari, Saurabh Gupta, George Gallarno, Jim Rogers, Don Maxwell. 38
- Big omics data experience. Patricia A. Kovatch, Anthony Costa, Zachary Giles, Eugene Fluder, Hyung Min Cho, Svetlana Mazurkova. 39
- The Spack package manager: bringing order to HPC software chaos. Todd Gamblin, Matthew P. LeGendre, Michael R. Collette, Gregory L. Lee, Adam Moody, Bronis R. de Supinski, Scott Futral. 40
- STELLA: a domain-specific tool for structured grid methods in weather and climate models. Tobias Gysi, Carlos Osuna, Oliver Fuhrer, Mauro Bianco, Thomas C. Schulthess. 41
- Improving the scalability of the ocean barotropic solver in the community earth system model. Yong Hu, Xiaomeng Huang, Allison H. Baker, Yu-Heng Tseng, Frank O. Bryan, John M. Dennis, Guangwen Yang. 42
- Particle tracking in open simulation laboratories. Kalin Kanov, Randal C. Burns. 43
- Energy-aware data transfer algorithms. Ismail Alan, Engin Arslan, Tevfik Kosar. 44
- IOrchestra: supporting high-performance data-intensive applications in the cloud via collaborative virtualization. Ron Chi-Lung Chiang, H. Howie Huang, Timothy Wood, Changbin Liu, Oliver Spatscheck. 45
- An elegant sufficiency: load-aware differentiated scheduling of data transfers. Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan, Ian T. Foster. 46
- ScaAnalyzer: a tool to identify memory scalability bottlenecks in parallel programs. Xu Liu, Bo Wu. 47
- 2-bound: a capacity and concurrency driven analytical model for many-core design. Yu-Hang Liu, Xian-He Sun. 48
- Recovering logical structure from Charm++ event traces. Katherine E. Isaacs, Abhinav Bhatele, Jonathan Lifflander, David Böhme, Todd Gamblin, Martin Schulz, Bernd Hamann, Peer-Timo Bremer. 49
- Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach. Christopher Sewell, Katrin Heitmann, Hal Finkel, George Zagaris, Suzanne Parete-Koon, Patricia K. Fasel, Adrian Pope, Nicholas Frontiere, Li-Ta Lo, O. E. Bronson Messer, Salman Habib, James P. Ahrens. 50
- Smart: a MapReduce-like framework for in-situ scientific analytics. Yi Wang, Gagan Agrawal, Tekin Bicer, Wei Jiang. 51
- Optimal scheduling of in-situ analysis for large-scale scientific simulations. Preeti Malakar, Venkatram Vishwanath, Todd Munson, Christopher Knight, Mark Hereld, Sven Leyffer, Michael E. Papka. 52
- Exploiting asynchrony from exact forward recovery for DUE in iterative solvers. Luc Jaulmes, Eduard Ayguadé, Marc Casas, Jesús Labarta, Miquel Moretó, Mateo Valero. 53
- High-performance algebraic multigrid solver optimized for multi-core based distributed parallel systems. JongSoo Park, Mikhail Smelyanskiy, Ulrike Meier Yang, Dheevatsa Mudigere, Pradeep Dubey. 54
- STS-k: a multilevel sparse triangular solution scheme for NUMA multicores. Humayun Kabir, Joshua Dennis Booth, Guillaume Aupy, Anne Benoit, Yves Robert, Padma Raghavan. 55
- Data partitioning strategies for graph workloads on heterogeneous clusters. Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, Lizy K. John. 56
- Scaling iterative graph computations with GraphMap. Kisung Lee, Ling Liu, Karsten Schwan, Calton Pu, Qi Zhang, Yang Zhou, Emre Yigitoglu, Pingpeng Yuan. 57
- PGX.D: a fast distributed graph processing engine. Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Verstraaten, Hassan Chafi. 58
- Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster. Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra. 59
- Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs. Theo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack Dongarra. 60
- CIVL: the concurrency intermediate verification language. Stephen F. Siegel, Manchun Zheng, Ziqing Luo, Timothy K. Zirkel, Andre V. Marianiello, John G. Edenhofner, Matthew B. Dwyer, Michael S. Rogers. 61
- Clock delta compression for scalable order-replay of non-deterministic parallel applications. Kento Sato, Dong H. Ahn, Ignacio Laguna, Gregory L. Lee, Martin Schulz. 62
- Relative debugging for a highly parallel hybrid computer system. Luiz De Rose, Andrew Gontarek, Aaron Vose, Robert Moench, David Abramson, Minh Ngoc Dinh, Chao Jin. 63
- Improving backfilling by using machine learning to predict running times. Éric Gaussier, David Glesser, Valentin Reis, Denis Trystram. 64
- Adaptive data placement for staging-based coupled scientific workflows. Qian Sun, Tong Jin, Melissa Romanus, Hoang Bui, Fan Zhang, Hongfeng Yu, Hemanth Kolla, Scott Klasky, Jacqueline Chen, Manish Parashar. 65
- Multi-objective job placement in clusters. Sergey Blagodurov, Alexandra Fedorova, Evgeny Vinnik, Tyler Dwyer, Fabien Hermenier. 66
- A work-efficient algorithm for parallel unordered depth-first search. Umut A. Acar, Arthur Charguéraud, Mike Rainey. 67
- Enterprise: breadth-first graph traversal on GPUs. Hang Liu, H. Howie Huang. 68
- GraphBIG: understanding graph computing in the context of industrial solutions. Lifeng Nai, Yinglong Xia, Ilie Gabriel Tanase, Hyesoon Kim, Ching-Yung Lin. 69
- Local recovery and failure masking for stencil-based applications at extreme scales. Marc Gamell, Keita Teranishi, Michael A. Heroux, Jackson Mayo, Hemanth Kolla, Jacqueline Chen, Manish Parashar. 70
- VOCL-FT: introducing techniques for efficient soft error coprocessor recovery. Antonio J. Peña, Wesley Bland, Pavan Balaji. 71
- Understanding the propagation of transient errors in HPC applications. Rizwan Ashraf, Roberto Gioiosa, Gokcen Kestor, Ronald F. DeMara, Chen-Yong Cher, Pradip Bose. 72
- Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. Torsten Hoefler, Roberto Belli. 73
- Node variability in large-scale power measurements: perspectives from the Green500, Top500 and EEHPCWG. Thomas Scogland, Jonathan Azose, David Rohr, Suzanne Rivoire, Natalie Bates, Daniel Hackenberg. 74
- A practical approach to reconciling availability, performance, and capacity in provisioning extreme-scale storage systems. Lipeng Wan, Feiyi Wang, Sarp Oral, Devesh Tiwari, Sudharshan S. Vazhkudai, Qing Cao. 75
- An input-adaptive and in-place approach to dense tensor-times-matrix multiply. Jiajia Li, Casey Battaglino, Ioakeim Perros, Jimeng Sun, Richard W. Vuduc. 76
- Scalable sparse tensor decompositions in distributed memory systems. Oguz Kaya, Bora Uçar. 77
- Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing. Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David K. Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi. 78
- Finding the limits of power-constrained application performance. Peter E. Bailey, Aniruddha Marathe, David K. Lowenthal, Barry Rountree, Martin Schulz. 79
- Dynamic power sharing for higher job throughput. Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz. 80
- Regent: a high-productivity programming language for HPC with logical regions. Elliott Slaughter, Wonchan Lee, Sean Treichler, Michael Bauer, Alex Aiken. 81
- Bridging OpenCL and CUDA: a comparative analysis and translation. Junghyun Kim, Thanh Tuan Dao, Jaehoon Jung, Jinyoung Joo, Jaejin Lee. 82
- CilkSpec: optimistic concurrency for Cilk. Shaizeen Aga, Sriram Krishnamoorthy, Satish Narayanasamy. 83