- Entering the petaflop era: the architecture and performance of Roadrunner. Kevin J. Barker, Kei Davis, Adolfy Hoisie, Darren J. Kerbyson, Michael Lang, Scott Pakin, José Carlos Sancho. 1
- High performance discrete Fourier transforms on graphics processors. Naga K. Govindaraju, Brandon Lloyd, Yuri Dotsenko, Burton Smith, John Manferdelli. 2
- Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols. Wei-keng Liao, Alok N. Choudhary. 3
- Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David A. Patterson, John Shalf, Katherine A. Yelick. 4
- Bandwidth intensive 3-D FFT kernel for GPUs using CUDA. Akira Nukada, Yasuhiko Ogata, Toshio Endo, Satoshi Matsuoka. 5
- Using server-to-server communication in parallel file systems to simplify consistency and improve performance. Philip H. Carns, Bradley W. Settlemyer, Walter B. Ligon III. 6
- Scientific application-based performance comparison of SGI Altix 4700, IBM POWER5+, and SGI ICE 8200 supercomputers. Subhash Saini, Dale Talcott, Dennis C. Jespersen, M. Jahed Djomehri, Haoqiang Jin, Rupak Biswas. 7
- Adapting a message-driven parallel application to GPU-accelerated clusters. James C. Phillips, John E. Stone, Klaus Schulten. 8
- Scaling parallel I/O performance through I/O delegate and caching system. Arifa Nisar, Wei-keng Liao, Alok N. Choudhary. 9
- Efficient management of data center resources for massively multiplayer online games. Vlad Nae, Alexandru Iosup, Stefan Podlipnig, Radu Prodan, Dick H. J. Epema, Thomas Fahringer. 10
- Performance optimization of TCP/IP over 10 gigabit ethernet by precise instrumentation. Takeshi Yoshino, Yutaka Sugawara, Katsushi Inagami, Junji Tamatsukuri, Mary Inaba, Kei Hiraki. 11
- A multi-level parallel simulation approach to electron transport in nano-scale transistors. Mathieu Luisier, Gerhard Klimeck. 12
- Feedback-controlled resource sharing for predictable eScience. Sang-Min Park, Marty Humphrey. 13
- Wide-area performance profiling of 10GigE and InfiniBand technologies. Nageswara S. V. Rao, Weikuan Yu, William R. Wing, Stephen W. Poole, Jeffrey S. Vetter. 14
- Accelerating configuration interaction calculations for nuclear structure. Philip Sternberg, Esmond G. Ng, Chao Yang, Pieter Maris, James P. Vary, Masha Sosonkina, Hung Viet Le. 15
- Efficient auction-based grid reservations using dynamic programming. Andrew Mutz, Richard Wolski. 16
- Asymmetric interactions in symmetric multi-core systems: analysis, enhancements and evaluation. T. Scogland, Pavan Balaji, Wu-chun Feng, G. Narayanaswamy. 17
- Dendro: parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees. Rahul S. Sampath, Santi S. Adavani, Hari Sundar, Ilya Lashuk, George Biros. 18
- Characterizing application sensitivity to OS interference using kernel-level noise injection. Kurt B. Ferreira, Patrick G. Bridges, Ron Brightwell. 19
- Performance prediction of large-scale parallell system and application using macro-level simulation. Ryutaro Susukita, Hisashige Ando, Mutsumi Aoyagi, Hiroaki Honda, Yuichi Inadomi, Koji Inoue, Shigeru Ishizuki, Yasunori Kimura, Hidemi Komatsu, Motoyoshi Kurokawa, Kazuaki Murakami, Hidetomo Shibamura, Shuji Yamamura, Yunqing Yu. 20
- A novel domain oriented approach for scientific grid workflow composition. Jun Qin, Thomas Fahringer. 21
- Toward loosely coupled programming on petascale systems. Ioan Raicu, Zhao Zhang, Michael Wilde, Ian T. Foster, Peter H. Beckman, Kamil Iskra, Ben Clifford. 22
- Early evaluation of IBM BlueGene/P. Sadaf R. Alam, Richard F. Barrett, M. Bast, Mark R. Fahey, Jeffery A. Kuehn, Collin McCurdy, J. Rogers, Philip C. Roth, Ramanan Sankaran, Jeffrey S. Vetter, Patrick H. Worley, Weikuan Yu. 23
- Nimrod/K: towards massively parallel dynamic grid workflows. David Abramson, Colin Enticott, Ilkay Altintas. 24
- SMARTMAP: operating system support for efficient data sharing among processes on a multi-core processor. Ron Brightwell, Kevin T. Pedretti, Trammell Hudson. 25
- Lessons learned at 208K: towards debugging millions of cores. Gregory L. Lee, Dong H. Ahn, Dorian C. Arnold, Bronis R. de Supinski, Matthew Legendre, Barton P. Miller, Martin Schulz, Ben Liblit. 26
- Applying double auctions for scheduling of workflows on the Grid. Marek Wieczorek, Stefan Podlipnig, Radu Prodan, Thomas Fahringer. 27
- A novel migration-based NUCA design for chip multiprocessors. Mahmut T. Kandemir, Feihui Li, Mary Jane Irwin, Seung Woo Son. 28
- Communication avoiding Gaussian elimination. Laura Grigori, James Demmel, Hua Xiang. 29
- Extending CC-NUMA systems to support write update optimizations. Liqun Cheng, John B. Carter. 30
- Benchmarking GPUs to tune dense linear algebra. Vasily Volkov, James Demmel. 31
- High-radix crossbar switches enabled by proximity communication. Hans Eberle, Pedro Javier García, Jose Flich, José Duato, Robert Drost, Nils Gura, David Hopkins, Wladek Olesinski. 32
- Massively parallel genomic sequence search on the Blue Gene/P architecture. Heshan Lin, Pavan Balaji, Ruth Poole, Carlos P. Sosa, Xiaosong Ma, Wu-chun Feng. 33
- The role of MPI in development time: a case study. Lorin Hochstein, Forrest Shull, Lynn B. Reid. 34
- An efficient parallel approach for identifying protein families in large-scale metagenomic data sets. Changjun Wu, Ananth Kalyanaraman. 35
- An adaptive cut-off for task parallelism. Alejandro Duran, Julita Corbalán, Eduard Ayguadé. 36
- EpiSimdemics: an efficient algorithm for simulating the spread of infectious disease over large realistic social networks. Christopher L. Barrett, Keith R. Bisset, Stephen Eubank, Xizhou Feng, Madhav V. Marathe. 37
- Programming the Intel 80-core network-on-a-chip terascale processor. Timothy G. Mattson, Rob F. Van der Wijngaart, Michael A. Frumkin. 38
- PAM: a novel performance/power aware meta-scheduler for multi-core systems. Mohammad Banikazemi, Dan E. Poff, Bülent Abali. 39
- Hiding I/O latency with pre-execution prefetching for parallel applications. Yong Chen, Surendra Byna, Xian-He Sun, Rajeev Thakur, William Gropp. 40
- A dynamic scheduler for balancing HPC applications. Carlos Boneti, Roberto Gioiosa, Francisco J. Cazorla, Mateo Valero. 41
- Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark. Hongzhang Shan, Katie Antypas, John Shalf. 42
- Proactive process-level live migration in HPC environments. Chao Wang, Frank Mueller, Christian Engelmann, Stephen L. Scott. 43
- Parallel I/O prefetching using MPI file caching and I/O signatures. Surendra Byna, Yong Chen, Xian-He Sun, Rajeev Thakur, William Gropp. 44
- BitDew: a programmable environment for large-scale data management and distribution. Gilles Fedak, Haiwu He, Franck Cappello. 45
- Scalable load-balance measurement for SPMD codes. Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Robert J. Fowler, Daniel A. Reed. 46
- Using overlays for efficient data transfer over shared wide-area networks. Gaurav Khanna, Ümit V. Çatalyürek, Tahsin M. Kurç, Rajkumar Kettimuthu, P. Sadayappan, Ian T. Foster, Joel H. Saltz. 47
- Massively parallel volume rendering using 2-3 swap image compositing. Hongfeng Yu, Chaoli Wang, Kwan-Liu Ma. 48
- Capturing performance knowledge for automated analysis. Kevin A. Huck, Oscar Hernandez, Van Bui, Sunita Chandrasekaran, Barbara M. Chapman, Allen D. Malony, Lois C. McInnes, Boyana Norris. 49
- The cost of doing science on the cloud: the Montage example. Ewa Deelman, Gurmeet Singh, Miron Livny, J. Bruce Berriman, John Good. 50
- High performance multivariate visual data exploration for extremely large data. Oliver Rübel, Prabhat, Kesheng Wu, Hank Childs, Jeremy Meredith, Cameron G. R. Geddes, Estelle Cormier-Michel, Sean Ahern, Gunther H. Weber, Peter Messmer, Hans Hagen, Bernd Hamann, E. Wes Bethel. 51
- Analysis of application heartbeats: learning structural and temporal features in time series data for identification of performance problems. Emma S. Buneci, Daniel A. Reed. 52
- Server-storage virtualization: integration and load balancing in data centers. Aameek Singh, Madhukar R. Korupolu, Dushmanta Mohapatra. 53
- Materialized community ground models for large-scale earthquake simulation. Steven W. Schlosser, Michael P. Ryan, Ricardo Taborda-Rios, Julio López, David R. O'Hallaron, Jacobo Bielak. 54
- Positivity, posynomials and tile size selection. Lakshminarayanan Renganarayanan, Sanjay V. Rajopadhye. 55
- A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories. Tiankai Tu, Charles A. Rendleman, David W. Borhani, Ron O. Dror, Justin Gullingsrud, Morten Ø. Jensen, John L. Klepeis, Paul Maragakis, Patrick Miller, Kate A. Stafford, David E. Shaw. 56
- Global trees: a framework for linked data structures on distributed memory parallel systems. D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan Parthasarathy, Atanas Rountev, P. Sadayappan. 57
- Parallel exact inference on the cell broadband engine processor. Yinglong Xia, Viktor K. Prasanna. 58
- Prefetch throttling and data pinning for improving performance of shared caches. Ozcan Ozturk, Seung Woo Son, Mahmut T. Kandemir, Mustafa Karaköy. 59
- High-frequency simulations of global seismic wave propagation using SPECFEM3D_GLOBE on 62K processors. Laura Carrington, Dimitri Komatitsch, Michael Laurenzano, Mustafa M. Tikir, David Michéa, Nicolas Le Goff, Allan Snavely, Jeroen Tromp. 60
- New algorithm to enable 400+ TFlop/s sustained performance in simulations of disorder effects in high-Tc superconductors. Gonzalo Alvarez, Michael S. Summers, Don E. Maxwell, Markus Eisenbach, Jeremy S. Meredith, Jeffrey M. Larkin, John M. Levesque, Thomas A. Maier, Paul R. C. Kent, Eduardo F. D'Azevedo, Thomas C. Schulthess. 61
- Scalable adaptive mantle convection simulation on petascale supercomputers. Carsten Burstedde, Omar Ghattas, Michael Gurnis, Georg Stadler, Eh Tan, Tiankai Tu, Lucas C. Wilcox, Shijie Zhong. 62
- 0.374 Pflop/s trillion-particle kinetic modeling of laser plasma interaction on Roadrunner. Kevin J. Bowers, Brian J. Albright, Benjamin Bergen, Lin Yin, Kevin J. Barker, Darren J. Kerbyson. 63
- 369 Tflop/s molecular dynamics simulations on the Roadrunner general-purpose heterogeneous supercomputer. Sriram Swaminarayan, Kai Kadau, Timothy C. Germann, Gordon C. Fossum. 64
- Linearly scaling 3D fragment method for large-scale electronic structure calculations. Lin-Wang Wang, Byounghak Lee, Hongzhang Shan, Zhengji Zhao, Juan C. Meza, Erich Strohmaier, David H. Bailey. 65