- Taking a quantum leap in time to solution for simulations of high-Tc superconductors. Peter Staar, Thomas A. Maier, Michael S. Summers, Gilles Fourestey, Raffaele Solcà, Thomas C. Schulthess. 1
- 20 petaflops simulation of proteins suspensions in crowding conditions. Massimo Bernaschi, Mauro Bisson, Massimiliano Fatica, Simone Melchionna. 2
- 11 PFLOP/s simulations of cloud cavitation collapse. Diego Rossinelli, Babak Hejazialhosseini, Panagiotis E. Hadjidoukas, Costas Bekas, Alessandro Curioni, Adam Bertsch, Scott Futral, Steffen J. Schmidt, Nikolaus A. Adams, Petros Koumoutsakos. 3
- The origin of mass. Peter A. Boyle, Michael I. Buchoff, Norman H. Christ, Taku Izubuchi, Chulwoo Jung, Thomas C. Luu, Robert D. Mawhinney, Chris Schroeder, Ron Soltz, Pavlos Vranas, Joseph Wasem. 4
- Radiative signatures of the relativistic Kelvin-Helmholtz instability. Michael Bussmann, Heiko Burau, Thomas E. Cowan, Alexander Debus, Alex Huebl, Guido Juckeland, Thomas Kluge, Wolfgang E. Nagel, Richard Pausch, Felix Schmitt, Ulrich Schramm, Joseph Schuchart, René Widera. 5
- HACC: extreme scaling and performance across diverse architectures. Salman Habib, Vitali A. Morozov, Nicholas Frontiere, Hal Finkel, Adrian Pope, Katrin Heitmann. 6
- ACR: automatic checkpoint/restart for soft and hard error protection. Xiang Ni, Esteban Meneses, Nikhil Jain, Laxmikant V. Kalé. 7
- SPBC: leveraging the characteristics of MPI HPC applications for scalable checkpointing. Thomas Ropars, Tatiana V. Martsinkevich, Amina Guermouche, André Schiper, Franck Cappello. 8
- Using simulation to explore distributed key-value stores for extreme-scale system services. Ke Wang, Abhishek Kulkarni, Michael Lang, Dorian Arnold, Ioan Raicu. 9
- General transformations for GPU execution of tree traversals. Michael Goldfarb, Youngjoon Jo, Milind Kulkarni. 10
- A large-scale cross-architecture evaluation of thread-coarsening. Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle. 11
- Semi-automatic restructuring of offloadable tasks for many-core accelerators. Nishkam Ravi, Yi Yang, Tao Bao, Srimat T. Chakradhar. 12
- A framework for load balancing of tensor contraction expressions via dynamic task partitioning. Pai-Wei Lai, Kevin Stock, Samyam Rajbhandari, Sriram Krishnamoorthy, P. Sadayappan. 13
- Load-balanced pipeline parallelism. Md Kamruzzaman, Steven Swanson, Dean M. Tullsen. 14
- A distributed dynamic load balancer for iterative applications. Harshitha Menon, Laxmikant V. Kalé. 15
- Distributed wait state tracking for runtime MPI deadlock detection. Tobias Hilbrich, Bronis R. de Supinski, Wolfgang E. Nagel, Joachim Protze, Christel Baier, Matthias S. Müller. 16
- Globalizing selectively: shared-memory efficiency with address-space separation. Nilesh Mahajan, Uday Pitambare, Arun Chauhan. 17
- Hybrid MPI: efficient message passing for multi-core systems. Andrew Friedley, Greg Bronevetsky, Torsten Hoefler, Andrew Lumsdaine. 18
- Performance evaluation of Intel® transactional synchronization extensions for high-performance computing. Richard M. Yoo, Christopher J. Hughes, Konrad Lai, Ravi Rajwar. 19
- Location-aware cache management for many-core processors with deep cache hierarchy. JongSoo Park, Richard M. Yoo, Daya Shanker Khudia, Christopher J. Hughes, Daehyun Kim. 20
- Practical nonvolatile multilevel-cell phase change memory. Doe Hyun Yoon, Jichuan Chang, Robert S. Schreiber, Norman P. Jouppi. 21
- Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults. Vilas Sridharan, Jon Stearley, Nathan DeBardeleben, Sean Blanchard, Sudhanva Gurumurthi. 22
- Exploring DRAM organizations for energy-efficient and resilient exascale memories. Bharan Giridhar, Michael Cieslak, Deepankar Duggal, Ronald G. Dreslinski, Hsing-Min Chen, Robert Patti, Betina Hold, Chaitali Chakrabarti, Trevor N. Mudge, David Blaauw. 23
- Low-power, low-storage-overhead chipkill correct via multi-line error correction. Xun Jian, Henry Duwe, John Sartori, Vilas Sridharan, Rakesh Kumar. 24
- AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs. Qian Wang, Xianyi Zhang, Yunquan Zhang, Qing Yi. 25
- Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes. Wai Teng Tang, Wen Jun Tan, Rajarshi Ray, Yi Wen Wong, Weiguang Chen, Shyh-hao Kuo, Rick Siow Mong Goh, Stephen John Turner, Weng-Fai Wong. 26
- Precimonious: tuning assistant for floating-point precision. Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, David Hough. 27
- A data-centric profiler for parallel programs. Xu Liu, John M. Mellor-Crummey. 28
- On the usefulness of object tracking techniques in performance analysis. Germán Llort, Harald Servat, Juan Gonzalez, Judit Gimenez, Jesús Labarta. 29
- Detection of false sharing using machine learning. Sanath Jayasena, Saman Amarasinghe, Asanka Abeyweera, Gayashan Amarasinghe, Himeshi De Silva, Sunimal Rathnayake, Xiaoqiao Meng, Yanbin Liu. 30
- Parallelizing the execution of sequential scripts. Zhao Zhang, Daniel S. Katz, Timothy G. Armstrong, Justin M. Wozniak, Ian T. Foster. 31
- Deterministic scale-free pipeline parallelism with hyperqueues. Hans Vandierendonck, Kallia Chronaki, Dimitrios S. Nikolopoulos. 32
- Compiling affine loop nests for distributed-memory parallel architectures. Uday Bondhugula. 33
- Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors. JongSoo Park, Ganesh Bikshandi, Karthikeyan Vaidyanathan, Ping Tak Peter Tang, Pradeep Dubey, Daehyun Kim. 34
- A framework for hybrid parallel flow simulations with a trillion cells in complex geometries. Christian Godenschwager, Florian Schornbaum, Martin Bauer, Harald Köstler, Ulrich Rüde. 35
- A new routing scheme for Jellyfish and its performance with HPC workloads. Xin Yuan, Santosh Mahapatra, Wickus Nienaber, Scott Pakin, Michael Lang. 36
- Enabling fair pricing on HPC systems with node sharing. Alex D. Breslow, Ananta Tiwari, Martin Schulz, Laura Carrington, Lingjia Tang, Jason Mars. 37
- ACIC: automatic cloud I/O configurator for HPC applications. Mingliang Liu, Ye Jin, Jidong Zhai, Yan Zhai, Qianqian Shi, Xiaosong Ma, Wenguang Chen. 38
- COCA: online distributed resource management for cost minimization and carbon neutrality in data centers. Shaolei Ren, Yuxiong He. 39
- Supercomputing with commodity CPUs: are mobile SoCs ready for HPC? Nikola Rajovic, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramírez, Mateo Valero. 40
- There goes the neighborhood: performance degradation due to nearby jobs. Abhinav Bhatele, Kathryn Mohror, Steve H. Langer, Katherine E. Isaacs. 41
- CooMR: cross-task coordination for efficient data management in MapReduce programs. Xiaobing Li, Yandong Wang, Yizheng Jiao, Cong Xu, Weikuan Yu. 42
- Effective sampling-driven performance tools for GPU-accelerated supercomputers. Milind Chabbi, Karthik Murthy, Michael Fagan, John M. Mellor-Crummey. 43
- Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach. Dong Li, Zizhong Chen, Panruo Wu, Jeffrey S. Vetter. 44
- Using automated performance modeling to find scalability bugs in complex codes. Alexandru Calotoiu, Torsten Hoefler, Marius Poke, Felix Wolf. 45
- Efficient data partitioning model for heterogeneous graphs in the cloud. Kisung Lee, Ling Liu. 46
- SDQuery DSI: integrating data management support with a wide area data transfer protocol. Yu Su, Yi Wang, Gagan Agrawal, Rajkumar Kettimuthu. 47
- Design and performance evaluation of NUMA-aware RDMA-based end-to-end data transfer systems. Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi. 48
- Scalable parallel OPTICS data clustering using graph algorithmic techniques. Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok N. Choudhary. 49
- Scalable matrix computations on large scale-free graphs using 2D graph partitioning. Erik G. Boman, Karen D. Devine, Sivasankaran Rajamanickam. 50
- Scalable parallel graph partitioning. Shad Kirmani, Padma Raghavan. 51
- Channel reservation protocol for over-subscribed channels and destinations. George Michelogiannakis, Nan Jiang, Daniel U. Becker, William J. Dally. 52
- Enabling highly-scalable remote memory access programming with MPI-3 one sided. Robert Gerstenberger, Maciej Besta, Torsten Hoefler. 53
- MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for Intel MIC clusters. Sreeram Potluri, Devendar Bureddy, Khaled Hamidouche, Akshay Venkatesh, Krishna Chaitanya Kandalla, Hari Subramoni, Dhabaleswar K. Panda. 54
- Exploring portfolio scheduling for long-term execution of scientific workloads in IaaS clouds. Kefeng Deng, Junqiang Song, Kaijun Ren, Alexandru Iosup. 55
- Cost-effective cloud HPC resource provisioning by building semi-elastic virtual clusters. Shuangcheng Niu, Jidong Zhai, Xiaosong Ma, Xiongchao Tang, Wenguang Chen. 56
- Exploiting application dynamism and cloud elasticity for continuous dataflows. Alok Gautam Kumbhare, Yogesh Simmhan, Viktor K. Prasanna. 57
- A 'cool' way of improving the reliability of HPC machines. Osman Sarood, Esteban Meneses, Laxmikant V. Kalé. 58
- Coordinated energy management in heterogeneous processors. Indrani Paul, Vignesh T. Ravi, Srilatha Manne, Manish Arora, Sudhakar Yalamanchili. 59
- Integrating dynamic pricing of electricity into energy aware scheduling for HPC systems. Xu Yang, Zhou Zhou, Sean Wallace, Zhiling Lan, Wei Tang, Susan Coghlan, Michael E. Papka. 60
- Petascale direct numerical simulation of turbulent channel flow on up to 786K cores. Myoungkyu Lee, Nicholas Malaya, Robert D. Moser. 61
- Solving the compressible Navier-Stokes equations on up to 1.97 million cores and 4.1 trillion grid points. Iván Bermejo-Moreno, Julien Bodart, Johan Larsson, Blaise M. Barney, Joseph W. Nichols, Steve Jones. 62
- Petascale WRF simulation of Hurricane Sandy: deployment of NCSA's Cray XE6 Blue Waters. Peter Johnsen, Mark Straka, Melvyn Shapiro, Alan Norton, Thomas Galarneau. 63
- Optimization of cloud task processing with checkpoint-restart mechanism. Sheng Di, Yves Robert, Frédéric Vivien, Derrick Kondo, Cho-Li Wang, Franck Cappello. 64
- Scalable virtual machine deployment using VM image caches. Kaveh Razavi, Thilo Kielmann. 65
- Guide-copy: fast and silent migration of virtual machine for datacenters. Jihun Kim, Dongju Chae, Jangwoo Kim, Jong Kim. 66
- Characterization and modeling of PIDX parallel I/O for performance optimization. Sidharth Kumar, Avishek Saha, Venkatram Vishwanath, Philip H. Carns, John A. Schmidt, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert Latham, Robert B. Ross, Michael E. Papka, Jacqueline Chen, Valerio Pascucci. 67
- Taming parallel I/O complexity with auto-tuning. Babak Behzad, Huong Vu Thanh Luu, Joseph Huchette, Surendra Byna, Prabhat, Ruth A. Aydt, Quincey Koziol, Marc Snir. 68
- Toward millions of file system IOPS on low-cost, commodity hardware. Da Zheng, Randal C. Burns, Alexander S. Szalay. 69
- Physics-based seismic hazard analysis on petascale heterogeneous supercomputers. Yifeng Cui, Efecan Poyraz, Kim B. Olsen, Jun Zhou, Kyle Withers, Scott Callaghan, Jeff Larkin, Clark C. Guest, Dong Ju Choi, Amit Chourasia, Zheqiang Shi, Steven M. Day, Philip Maechling, Thomas H. Jordan. 70
- n-tuple computation in many-body molecular dynamics simulation. Manaschai Kunaseth, Rajiv K. Kalia, Aiichiro Nakano, Ken-ichi Nomura, Priya Vashishta. 71
- 2HOT: an improved parallel hashed oct-tree n-body algorithm for cosmological simulation. Michael S. Warren. 72
- SIDR: structure-aware intelligent data routing in Hadoop. Joe B. Buck, Noah Watkins, Greg Levin, Adam Crume, Kleoni Ioannidou, Scott A. Brandt, Carlos Maltzahn, Neoklis Polyzotis, Aaron Torres. 73
- Using cross-layer adaptations for dynamic data management in large scale coupled scientific workflows. Tong Jin, Fan Zhang, Qian Sun, Hoang Bui, Manish Parashar, Hongfeng Yu, Scott Klasky, Norbert Podhorszki, Hasan Abbasi. 74
- Exploring the future of out-of-core computing with compute-local non-volatile memory. Myoungsoo Jung, Ellis Herbert Wilson, Wonil Choi, John Shalf, Hasan Metin Aktulga, Chao Yang, Erik Saule, Ümit V. Çatalyürek, Mahmut T. Kandemir. 75
- Assessing the effects of data compression in simulations using physically motivated metrics. Daniel E. Laney, Steven Langer, Christopher Weber, Peter Lindstrom, Al Wegener. 76
- Exploring power behaviors and trade-offs of in-situ data analytics. Marc Gamell, Ivan Rodero, Manish Parashar, Janine Bennett, Hemanth Kolla, Jacqueline Chen, Peer-Timo Bremer, Aaditya G. Landge, Attila Gyulassy, Patrick McCormick, Scott Pakin, Valerio Pascucci, Scott Klasky. 77
- GoldRush: resource efficient in situ scientific data analytics using fine-grained interference aware execution. Fang Zheng, Hongfeng Yu, Can Hantas, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Scott Klasky. 78
- A scalable, efficient scheme for evaluation of stencil computations over unstructured meshes. James King, Robert M. Kirby. 79
- Scalable domain decomposition preconditioners for heterogeneous elliptic problems. Pierre Jolivet, Frédéric Hecht, Frédéric Nataf, Christophe Prud'homme. 80
- Parallel design and performance of nested filtering factorization preconditioner. Long Qu, Laura Grigori, Frédéric Nataf. 81
- Kinetic turbulence simulations at extreme scale on leadership-class systems. Bei Wang, Stéphane Ethier, William M. Tang, Timothy Williams, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker. 82
- Swendsen-Wang multi-cluster algorithm for the 2D/3D Ising model on Xeon Phi and GPU. Florian Wende, Thomas Steinke. 83
- Mr. Scan: extreme scale density-based clustering using a tree-based network of GPGPU nodes. Benjamin Welton, Evan Samanas, Barton P. Miller. 84
- The Science DMZ: a network design pattern for data-intensive science. Eli Dart, Lauren Rotman, Brian Tierney, Mary Hester, Jason Zurawski. 85
- Enabling comprehensive data-driven system management for large computational facilities. James C. Browne, Robert L. DeLeon, Charng-da Lu, Matthew D. Jones, Steven M. Gallo, Amin Ghadersohi, Abani K. Patra, William L. Barth, John Hammond, Thomas R. Furlani, Robert T. McLay. 86
- Insights for exascale IO APIs from building a petascale IO API. Jay F. Lofstead, Robert Ross. 87
- Parallel reduction to Hessenberg form with algorithm-based fault tolerance. Yulu Jia, George Bosilca, Piotr Luszczek, Jack J. Dongarra. 88
- A computationally efficient algorithm for the 2D covariance method. Oded Green, Yitzhak Birk. 89
- An improved parallel singular value algorithm and its implementation for multicore hardware. Azzam Haidar, Jakub Kurzak, Piotr Luszczek. 90
- Distributed-memory parallel algorithms for generating massive scale-free networks using preferential attachment model. Md. Maksudul Alam, Maleq Khan, Madhav V. Marathe. 91
- On fast parallel detection of strongly connected components (SCC) in small-world graphs. Sungpack Hong, Nicole C. Rodia, Kunle Olukotun. 92
- Algorithms for high-throughput disk-to-disk sorting. Hari Sundar, Dhairya Malhotra, Karl W. Schulz. 93
- An early performance evaluation of many integrated core architecture based SGI rackable computing system. Subhash Saini, Haoqiang Jin, Dennis C. Jespersen, Huiyu Feng, M. Jahed Djomehri, William Arasin, Robert Hood, Piyush Mehrotra, Rupak Biswas. 94
- Predicting application performance using supervised learning on communication features. Nikhil Jain, Abhinav Bhatele, Michael P. Robson, Todd Gamblin, Laxmikant V. Kalé. 95
- Investigating applications portability with the Uintah DAG-based runtime system on PetaScale supercomputers. Qingyu Meng, Alan Humphrey, John Schmidt, Martin Berzins. 96