- Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems. Jatin Chhugani, Changkyu Kim, Hemant Shukla, JongSoo Park, Pradeep Dubey, John Shalf, Horst D. Simon. 1
- Toward real-time modeling of human heart ventricles at cellular resolution: simulation of drug-induced arrhythmias. Arthur A. Mirin, David F. Richards, James N. Glosli, Erik W. Draeger, Bor Chan, Jean-Luc Fattebert, William D. Krauss, Tomas Oppelstrup, John Jeremy Rice, John A. Gunnels, Viatcheslav Gurev, Changhoan Kim, John Magerlein, Matthias Reumann, Hui-Fang Wen. 2
- Extreme-scale UQ for Bayesian inverse problems governed by PDEs. Tan Bui-Thanh, Carsten Burstedde, Omar Ghattas, James Martin, Georg Stadler, Lucas C. Wilcox. 3
- The universe at extreme scale: multi-petaflop sky simulation on the BG/Q. Salman Habib, Vitali A. Morozov, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Tom Peterka, Joseph A. Insley, David Daniel, Patricia K. Fasel, Nicholas Frontiere, Zarija Lukic. 4
- N-body simulation on K computer: the gravitational trillion-body problem. Tomoaki Ishiyama, Keigo Nitadori, Junichiro Makino. 5
- Demonstrating Lustre over a 100Gbps wide area network of 3,500km. Robert Henschel, Stephen C. Simms, David Y. Hancock, Scott Michael, Tom Johnson, Nathan Heald, Thomas William, Donald K. Berry, Matthew Allen, Richard Knepper, Matt Davy, Matthew R. Link, Craig A. Stewart. 6
- A study on data deduplication in HPC storage systems. Dirk Meister, Jürgen Kaiser, André Brinkmann, Toni Cortes, Michael Kuhn, Julian M. Kunkel. 7
- Characterizing output bottlenecks in a supercomputer. Bing Xie, Jeffrey Chase, David Dillow, Oleg Drokin, Scott Klasky, Sarp Oral, Norbert Podhorszki. 8
- Portable section-level tuning of compiler parallelized applications. Dheya Mustafa, Rudolf Eigenmann. 9
- A multi-objective auto-tuning framework for parallel codes. Herbert Jordan, Peter Thoman, Juan Jose Durillo Barrionuevo, Simone Pellegrini, Philipp Gschwandtner, Thomas Fahringer, Hans Moritsch. 10
- Patus for convenient high-performance stencils: evaluation in earthquake simulations. Matthias Christen, Olaf Schenk, Yifeng Cui. 11
- Direction-optimizing breadth-first search. Scott Beamer, Krste Asanovic, David A. Patterson. 12
- Breaking the speed and scalability barriers for graph exploration on distributed-memory machines. Fabio Checconi, Fabrizio Petrini, Jeremiah Willcock, Andrew Lumsdaine, Anamitra R. Choudhury, Yogish Sabharwal. 13
- Large-scale energy-efficient graph traversal: a path to efficient data-intensive supercomputing. Nadathur Satish, Changkyu Kim, Jatin Chhugani, Pradeep Dubey. 14
- Hybridizing S3D into an exascale application using OpenACC: an approach for moving to multi-petaflops and beyond. John M. Levesque, Ramanan Sankaran, Ray W. Grout. 15
- High throughput software for direct numerical simulations of compressible two-phase flows. Babak Hejazialhosseini, Diego Rossinelli, Christian Conti, Petros Koumoutsakos. 16
- McrEngine: a scalable checkpointing system using data-aware aggregation and compression. Tanzima Zerin Islam, Kathryn Mohror, Saurabh Bagchi, Adam Moody, Bronis R. de Supinski, Rudolf Eigenmann. 17
- Alleviating scalability issues of checkpointing protocols. Rolf Riesen, Kurt B. Ferreira, Dilma Da Silva, Pierre Lemarinier, Dorian Arnold, Patrick G. Bridges. 18
- Design and modeling of a non-blocking checkpointing system. Kento Sato, Naoya Maruyama, Kathryn Mohror, Adam Moody, Todd Gamblin, Bronis R. de Supinski, Satoshi Matsuoka. 19
- Scalia: an adaptive scheme for efficient multi-cloud storage. Thanasis G. Papaioannou, Nicolas Bonvin, Karl Aberer. 20
- Host load prediction in a Google compute cloud with a Bayesian model. Sheng Di, Derrick Kondo, Walfredo Cirne. 21
- Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. Maciej Malawski, Gideon Juve, Ewa Deelman, Jarek Nabrzyski. 22
- Early evaluation of directive-based GPU programming models for productive exascale computing. Seyong Lee, Jeffrey S. Vetter. 23
- Automatic generation of software pipelines for heterogeneous parallel systems. Jacques A. Pienaar, Srimat T. Chakradhar, Anand Raghunathan. 24
- Accelerating MapReduce on a coupled CPU-GPU architecture. Linchuan Chen, Xin Huo, Gagan Agrawal. 25
- Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC. Francisco D. Igual, Murtaza Ali, Arnon Friedmann, Eric Stotzer, Timothy Wentz, Robert A. van de Geijn. 26
- A scalable, numerically stable, high-performance tridiagonal solver using GPUs. Li-Wen Chang, John A. Stratton, Hee-Seok Kim, Wen-mei W. Hwu. 27
- Efficient backprojection-based synthetic aperture radar computation with many-core processors. JongSoo Park, Ping Tak Peter Tang, Mikhail Smelyanskiy, Daehyun Kim, Thomas Benson. 28
- Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs. Peng Li, Guodong Li, Ganesh Gopalakrishnan. 29
- MPI runtime error detection with MUST: advances in deadlock detection. Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias S. Müller. 30
- Novel views of performance data to analyze large-scale adaptive applications. Abhinav Bhatele, Todd Gamblin, Katherine E. Isaacs, Brian T. N. Gunney, Martin Schulz, Peer-Timo Bremer, Bernd Hamann. 31
- RAMZzz: rank-aware DRAM power management with dynamic migrations and demotions. Donghong Wu, Bingsheng He, Xueyan Tang, Jianliang Xu, Minyi Guo. 32
- MAGE: adaptive granularity and ECC for resilient and power efficient memory systems. Sheng Li, Doe Hyun Yoon, Ke Chen, Jishen Zhao, Jung Ho Ahn, Jay B. Brockman, Yuan Xie, Norman P. Jouppi. 33
- Protocols for wide-area data-intensive applications: design and performance issues. Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi, Brian Tierney, Eric Pouyoul. 34
- High performance RDMA-based design of HDFS over InfiniBand. Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Raghunath Rajachandrasekar, Hao Wang, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda. 35
- Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms. Kiril Dichev, Fergal Reid, Alexey L. Lastovetsky. 36
- A divide and conquer strategy for scaling weather simulations with multiple regions of interest. Preeti Malakar, Thomas George, Sameer Kumar, Rashmi Mittal, Vijay Natarajan, Yogish Sabharwal, Vaibhav Saxena, Sathish S. Vadhiyar. 37
- Forward and adjoint simulations of seismic wave propagation on emerging large-scale GPU architectures. Max Rietmann, Peter Messmer, Tarje Nissen-Meyer, Daniel Peter, Piero Basini, Dimitri Komatitsch, Olaf Schenk, Jeroen Tromp, Lapo Boschi, Domenico Giardini. 38
- Bamboo: translating MPI applications to a latency-tolerant, data-driven form. Tan Nguyen, Pietro Cicotti, Eric J. Bylaska, Dan Quinlan, Scott B. Baden. 39
- Tiling stencil computations to maximize parallelism. Vinayaka Bandishti, Irshad Pananilath, Uday Bondhugula. 40
- Compiler-directed file layout optimization for hierarchical storage systems. Wei Ding, Yuanrui Zhang, Mahmut T. Kandemir, Seung Woo Son. 41
- A framework for low-communication 1-D FFT. Ping Tak Peter Tang, JongSoo Park, Daehyun Kim, Vladimir Petrov. 42
- Parallel geometric-algebraic multigrid on unstructured forests of octrees. Hari Sundar, George Biros, Carsten Burstedde, Johann Rudi, Omar Ghattas, Georg Stadler. 43
- Scalable multi-GPU 3-D FFT for TSUBAME 2.0 supercomputer. Akira Nukada, Kento Sato, Satoshi Matsuoka. 44
- Peta-scale lattice quantum chromodynamics on a Blue Gene/Q supercomputer. Jun Doi. 45
- Massively parallel X-ray scattering simulations. Abhinav Sarje, Xiaoye S. Li, Slim Chourou, Elaine R. Chan, Alexander Hexemer. 46
- High performance radiation transport simulations: preparing for Titan. C. Baker, G. G. Davidson, T. M. Evans, S. Hamilton, J. Jarrell, W. Joubert. 47
- Byte-precision level of detail processing for variable precision analytics. John Jenkins, Eric R. Schendel, Sriram Lakshminarasimhan, David A. Boyuka II, Terry Rogers, Stéphane Ethier, Robert B. Ross, Scott Klasky, Nagiza F. Samatova. 48
- Combining in-situ and in-transit processing to enable extreme-scale scientific analysis. Janine Bennett, Hasan Abbasi, Peer-Timo Bremer, Ray W. Grout, Attila Gyulassy, Tong Jin, Scott Klasky, Hemanth Kolla, Manish Parashar, Valerio Pascucci, Philippe P. Pébay, David C. Thompson, Hongfeng Yu, Fan Zhang, Jacqueline Chen. 49
- Efficient data restructuring and aggregation for I/O acceleration in PIDX. Sidharth Kumar, Venkatram Vishwanath, Philip H. Carns, Joshua A. Levine, Robert Latham, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert B. Ross, Michael E. Papka, Jacqueline Chen, Valerio Pascucci. 50
- Measuring interference between live datacenter applications. Melanie Kambadur, Tipp Moseley, Rick Hank, Martha A. Kim. 51
- T*: a data-centric cooling energy costs reduction approach for big data analytics cloud. Rini T. Kaushik, Klara Nahrstedt. 52
- ValuePack: value-based scheduling framework for CPU-GPU clusters. Vignesh T. Ravi, Michela Becchi, Gagan Agrawal, Srimat T. Chakradhar. 53
- Compass: a scalable simulator for an architecture for cognitive computing. Robert Preissl, Theodore M. Wong, Pallab Datta, Myron Flickner, Raghavendra Singh, Steven K. Esser, William P. Risk, Horst D. Simon, Dharmendra S. Modha. 54
- Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6. Yanhua Sun, Gengbin Zheng, Chao Mei, Eric J. Bohm, James C. Phillips, Laxmikant V. Kalé, Terry R. Jones. 55
- Heuristic static load-balancing algorithm applied to the fragment molecular orbital method. Yuri Alexeev, Ashutosh Mahajan, Sven Leyffer, Graham Fletcher, Dmitri G. Fedorov. 56
- Classifying soft error vulnerabilities in extreme-scale scientific applications using a binary instrumentation tool. Dong Li, Jeffrey S. Vetter, Weikuan Yu. 57
- Containment domains: a scalable, efficient, and flexible resilience scheme for exascale systems. Jinsuk Chung, Ikhwan Lee, Michael Sullivan, Jee Ho Ryoo, Dong-Wan Kim, Doe Hyun Yoon, Larry Kaplan, Mattan Erez. 58
- Parallel I/O, analysis, and visualization of a trillion particle simulation. Surendra Byna, Jerry Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu. 59
- Data-intensive spatial filtering in large numerical simulation datasets. Kalin Kanov, Randal C. Burns, Gregory L. Eyink, Charles Meneveau, Alexander S. Szalay. 60
- Parallel particle advection and FTLE computation for time-varying flow fields. Boonthanome Nouanesengsy, Teng-Yok Lee, Kewei Lu, Han-Wei Shen, Tom Peterka. 61
- A new scalable parallel DBSCAN algorithm using the disjoint-set data structure. Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok N. Choudhary. 62
- Parallel Bayesian network structure learning with application to gene networks. Olga Nikolova, Srinivas Aluru. 63
- A multithreaded algorithm for network alignment via approximate matching. Arif M. Khan, David F. Gleich, Alex Pothen, Mahantesh Halappanavar. 64
- Characterizing and mitigating work time inflation in task parallel programs. Stephen Olivier, Bronis R. de Supinski, Martin Schulz, Jan F. Prins. 65
- Legion: expressing locality and independence with logical regions. Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken. 66
- Designing a unified programming model for heterogeneous machines. Michael Garland, Manjunath Kudlur, Yili Zheng. 67
- Design and implementation of an intelligent end-to-end network QoS system. Sushant Sharma, Dimitrios Katramatos, Dantong Yu, Li Shi. 68
- Looking under the hood of the IBM Blue Gene/Q network. Dong Chen, Noel Eisley, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Fabrizio Petrini, Robert M. Senger, Yutaka Sugawara, Robert Walkup, Burkhard D. Steinmacher-Burow, Anamitra R. Choudhury, Yogish Sabharwal, Swati Singhal, Jeffrey J. Parker. 69
- Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes. Hari Subramoni, Sreeram Potluri, Krishna Chaitanya Kandalla, B. Barth, Jérôme Vienne, Jeff Keasler, Karen A. Tomko, Karl W. Schulz, Adam Moody, Dhabaleswar K. Panda. 70
- Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications. Guancheng Chen, Per Stenström. 71
- Code generation for parallel execution of a class of irregular loops on distributed memory systems. Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan. 72
- First-ever full observable universe simulation. Jean-Michel Alimi, Vincent Bouillot, Yann Rasera, Vincent Reverdy, Pier-Stefano Corasaniti, Irène Balmès, Stéphane Requena, Xavier Delaruelle, Jean-Noel Richet. 73
- Optimizing the computation of n-point correlations on large-scale astronomical data. William B. March, Kenneth Czechowski, Marat Dukhan, Thomas Benson, Dongryeol Lee, Andrew J. Connolly, Richard W. Vuduc, Edmond Chow, Alexander G. Gray. 74
- Hierarchical task mapping of cell-based AMR cosmology simulations. Jingjin Wu, Zhiling Lan, Xuanxing Xiong, Nickolay Y. Gnedin, Andrey V. Kravtsov. 75
- A study of DRAM failures in the field. Vilas Sridharan, Dean Liberty. 76
- Fault prediction under the microscope: a closer look into HPC systems. Ana Gainaru, Franck Cappello, Marc Snir, William Kramer. 77
- Detection and correction of silent data corruption for large-scale high-performance computing. David Fiala, Frank Mueller, Christian Engelmann, Rolf Riesen, Kurt B. Ferreira, Ron Brightwell. 78
- ATLAS grid workload on NDGF resources: analysis, modeling, and workload generation. Dmytro Karpenko, Roman Vitenberg, Alexander L. Read. 79
- On the effectiveness of application-aware self-management for scientific discovery in volunteer computing systems. Trilce Estrada, Michela Taufer. 80
- On using virtual circuits for GridFTP transfers. Z. Liu, Malathi Veeraraghavan, Zhenzhen Yan, Chris Tracy, Jing Tie, Ian T. Foster, J. Dennis, Jason Hick, Y. Li, W. Yang. 81
- Dataflow-driven GPU performance projection for multi-kernel transformations. Jiayuan Meng, Vitali A. Morozov, Venkatram Vishwanath, Kalyan Kumaran. 82
- A practical method for estimating performance degradation on multicore processors, and its application to HPC workloads. Tyler Dwyer, Alexandra Fedorova, Sergey Blagodurov, Mark Roth, Fabien Gaud, Jian Pei. 83
- Aspen: a domain specific language for performance modeling. Kyle Spafford, Jeffrey S. Vetter. 84
- Design and analysis of data management in scalable parallel scripting. Zhao Zhang, Daniel S. Katz, Justin M. Wozniak, Allan Espinosa, Ian T. Foster. 85
- Usage behavior of a large-scale scientific archive. Ian F. Adams, Brian A. Madden, Joel C. Frank, Mark W. Storer, Ethan L. Miller, Gene Harano. 86
- On distributed file tree walk of parallel file systems. Jharrod Lafon, Satyajayant Misra, Jon Bringhurst. 87
- Application data prefetching on the IBM Blue Gene/Q supercomputer. I-Hsin Chung, Changhoan Kim, Hui-Fang Wen, Guojing Cong. 88
- Hardware-software coherence protocol for the coexistence of caches and local memories. Lluc Alvarez, Lluís Vilanova, Marc González, Xavier Martorell, Nacho Navarro, Eduard Ayguadé. 89
- What scientific applications can benefit from hardware transactional memory? Martin Schindewolf, Barna L. Bihari, John C. Gyllenhaal, Martin Schulz, Amy Wang, Wolfgang Karl. 90
- A parallel two-level preconditioner for cosmic microwave background map-making. Laura Grigori, Radek Stompor, Mikolaj Szydlarski. 91
- A massively space-time parallel N-body solver. Robert Speck, Daniel Ruprecht, Rolf Krause, Matthew Emmett, M. Minion, Mathias Winkel, Paul Gibbon. 92
- High-performance general solver for extremely large-scale semidefinite programming problems. Katsuki Fujisawa, Hitoshi Sato, Satoshi Matsuoka, Toshio Endo, Makoto Yamashita, Maho Nakata. 93
- Extending the BT NAS parallel benchmark to exascale computing. Rob F. Van der Wijngaart, Srinivas Sridharan, Victor W. Lee. 94
- NUMA-aware graph mining techniques for performance and energy efficiency. Michael R. Frasca, Kamesh Madduri, Padma Raghavan. 95
- Optimization of geometric multigrid for emerging multi- and manycore processors. Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker. 96
- Mapping applications with collectives over sub-communicators on torus networks. Abhinav Bhatele, Todd Gamblin, Steve H. Langer, Peer-Timo Bremer, Erik W. Draeger, Bernd Hamann, Katherine E. Isaacs, Aaditya G. Landge, Joshua A. Levine, Valerio Pascucci, Martin Schulz, Charles H. Still. 97
- Optimization principles for collective neighborhood communications. Torsten Hoefler, Timo Schneider. 98
- Optimizing overlay-based virtual networking through optimistic interrupts and cut-through forwarding. Zheng Cui, Lei Xia, Patrick G. Bridges, Peter A. Dinda, John R. Lange. 99
- Communication avoiding and overlapping for numerical linear algebra. Evangelos Georganas, Jorge González-Domínguez, Edgar Solomonik, Yili Zheng, Juan Touriño, Katherine A. Yelick. 100
- Communication-avoiding parallel Strassen: implementation and performance. Benjamin Lipshitz, Grey Ballard, James Demmel, Oded Schwartz. 101
- Managing data-movement for effective shared-memory parallelization of out-of-core sparse solvers. Haim Avron, Anshul Gupta. 102
- Cray Cascade: a scalable HPC system based on a Dragonfly network. Greg Faanes, Abdulla Bataineh, Duncan Roweth, Tom Court, Edwin Froese, Robert Alverson, Tim Johnson, Joe Kopnick, Mike Higgins, James Reinhard. 103
- N-body simulation with 20.5Gflops/W performance. Junichiro Makino, Hiroshi Daisaka. 104
- SGI® UV2: a fused computation and data analysis machine. Greg Thorson, Michael Woodacre. 105