- First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer. Yukihiro Hasegawa, Jun-ichi Iwata, Miwako Tsuji, Daisuke Takahashi, Atsushi Oshiyama, Kazuo Minami, Taisuke Boku, Fumiyoshi Shoji, Atsuya Uno, Motoyoshi Kurokawa, Hikaru Inoue, Ikuo Miyoshi, Mitsuo Yokokawa. 1
- Atomistic nanoelectronic device engineering with sustained performances up to 1.44 PFlop/s. Mathieu Luisier, Timothy B. Boykin, Gerhard Klimeck, Wolfgang Fichtner. 2
- Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer. Takashi Shimokawabe, Takayuki Aoki, Tomohiro Takaki, Toshio Endo, Akinori Yamanaka, Naoya Maruyama, Akira Nukada, Satoshi Matsuoka. 3
- Petaflop biofluidics simulations on a two million-core system. Massimo Bernaschi, Mauro Bisson, Toshio Endo, Satoshi Matsuoka, Massimiliano Fatica, Simone Melchionna. 4
- A new computational paradigm in multiscale simulations: application to brain blood flow. Leopold Grinberg, Joseph A. Insley, Vitali A. Morozov, Michael E. Papka, George E. Karniadakis, Dmitry A. Fedosov, Kalyan Kumaran. 5
- Optimizing symmetric dense matrix-vector multiplication on GPUs. Rajib Nath, Stanimire Tomov, Tingxing Dong, Jack Dongarra. 6
- Tiled QR factorization algorithms. Henricus Bouwmeester, Mathias Jacquelin, Julien Langou, Yves Robert. 7
- Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels. Azzam Haidar, Hatem Ltaief, Jack Dongarra. 8
- Liszt: a domain specific language for building portable mesh-based PDE solvers. Zach DeVito, Niels Joubert, Francisco Palacios, Stephen Oakley, Montserrat Medina, Mike Barrientos, Erich Elsen, Frank Ham, Alex Aiken, Karthik Duraisamy, Eric Darve, Juan Alonso, Pat Hanrahan. 9
- Simplified parallel domain traversal. Wesley Kendall, Jingyuan Wang, Melissa Allen, Tom Peterka, Jian Huang, David Erickson. 10
- Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers. Naoya Maruyama, Tatsuo Nomura, Kento Sato, Satoshi Matsuoka. 11
- CudaDMA: optimizing GPU memory bandwidth via warp specialization. Michael Bauer, Henry Cook, Brucek Khailany. 12
- Dymaxion: optimizing memory access patterns for heterogeneous systems. Shuai Che, Jeremy W. Sheaffer, Kevin Skadron. 13
- GROPHECY: GPU performance projection from CPU code skeletons. Jiayuan Meng, Vitali A. Morozov, Kalyan Kumaran, Venkatram Vishwanath, Thomas D. Uram. 14
- Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms. Robert Preissl, Nathan Wichmann, Bill Long, John Shalf, Stéphane Ethier, Alice E. Koniges. 15
- Parallel random numbers: as easy as 1, 2, 3. John K. Salmon, Mark A. Moraes, Ron O. Dror, David E. Shaw. 16
- Server-side I/O coordination for parallel file systems. Huaiming Song, Yanlong Yin, Xian-He Sun, Rajeev Thakur, Samuel Lang. 17
- QoS support for end users of I/O-intensive applications using shared storage systems. Xuechen Zhang, Kei Davis, Song Jiang. 18
- Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems. Venkatram Vishwanath, Mark Hereld, Vitali A. Morozov, Michael E. Papka. 19
- GreenSlot: scheduling energy consumption in green datacenters. Iñigo Goiri, Ryan Beauchea, Kien Le, Thu D. Nguyen, Md. E. Haque, Jordi Guitart, Jordi Torres, Ricardo Bianchini. 20
- A 'cool' load balancer for parallel applications. Osman Sarood, Laxmikant V. Kalé. 21
- Reducing electricity cost through virtual machine placement in high performance computing clouds. Kien Le, Ricardo Bianchini, Jingru Zhang, Yogesh Jaluria, Jiandong Meng, Thu D. Nguyen. 22
- Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems. Kamesh Madduri, Khaled Z. Ibrahim, Samuel Williams, Eun-Jin Im, Stéphane Ethier, John Shalf, Leonid Oliker. 23
- Unitary qubit lattice simulations of multiscale phenomena in quantum turbulence. George Vahala, Min Soe, Bo Zhang, Jeffrey Yepez, Linda Vahala, Jonathan Carter, Sean Ziegeler. 24
- An image compositing solution at scale. Kenneth Moreland, Wesley Kendall, Tom Peterka, Jian Huang. 25
- The IBM Blue Gene/Q interconnection network and message unit. Dong Chen, Noel Eisley, Philip Heidelberger, Robert M. Senger, Yutaka Sugawara, Sameer Kumar, Valentina Salapura, David L. Satterfield, Burkhard D. Steinmacher-Burow, Jeffrey J. Parker. 26
- High-efficiency server design. Eitan Frachtenberg, Ali Heydari, Harry Li, Amir Michael, Jacob Na, Avery Nisbet, Pierluigi Sarti. 27
- Using the TOP500 to trace and project technology and architecture trends. Peter M. Kogge, Timothy J. Dysart. 28
- I/O streaming evaluation of batch queries for data-intensive computational turbulence. Kalin Kanov, Eric A. Perlman, Randal C. Burns, Yanif Ahmad, Alexander S. Szalay. 29
- Parallel index and query for large scale data analysis. Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel, Prabhat, Robert D. Ryne. 30
- ISABELA-QA: query-driven analytics with ISABELA-compressed extreme-scale scientific data. Sriram Lakshminarasimhan, John Jenkins, Isha Arkatkar, Zhenhuan Gong, Hemanth Kolla, Seung-Hoe Ku, Stéphane Ethier, Jackie Chen, Choong-Seock Chang, Scott Klasky, Robert Latham, Robert B. Ross, Nagiza F. Samatova. 31
- FTI: high performance fault tolerance interface for hybrid systems. Leonardo Arturo Bautista Gomez, Seiji Tsuboi, Dimitri Komatitsch, Franck Cappello, Naoya Maruyama, Satoshi Matsuoka. 32
- Checkpointing strategies for parallel jobs. Marin Bougeret, Henri Casanova, Mikaël Rabie, Yves Robert, Frédéric Vivien. 33
- BlobCR: efficient checkpoint-restart for HPC applications on IaaS clouds using virtual disk image snapshots. Bogdan Nicolae, Franck Cappello. 34
- Fast implementation of DGEMM on Fermi GPU. Guangming Tan, Linchuan Li, Sean Triechle, Everett Phillips, Yungang Bao, Ninghui Sun. 35
- Scalable fast multipole methods on distributed heterogeneous architectures. Qi Hu, Nail A. Gumerov, Ramani Duraiswami. 36
- Multi-science applications with single codebase - GAMER - for massively parallel architectures. Hemant Shukla, Hsi-Yu Schive, Tak-Pong Woo, Tzihong Chiueh. 37
- Virtual I/O caching: dynamic storage cache management for concurrent workloads. Michael R. Frasca, Ramya Prabhakar, Padma Raghavan, Mahmut T. Kandemir. 38
- SCMFS: a file system for storage class memory. XiaoJian Wu, A. L. Narasimha Reddy. 39
- Optimized pre-copy live migration for memory intensive applications. Khaled Z. Ibrahim, Steven A. Hofmeyr, Costin Iancu, Eric Roman. 40
- Scalable hashing for shared memory supercomputers. Eric L. Goodman, M. Nicole Lemaster, Edward Jimenez. 41
- An early performance analysis of POWER7-IH HPC systems. Kevin J. Barker, Adolfy Hoisie, Darren J. Kerbyson. 42
- A similarity measure for time, frequency, and dependencies in large-scale workloads. Mario Lassnig, Thomas Fahringer, Vincent Garonne, Angelos Molfetas, Martin Barisits. 43
- Evaluating the viability of process replication reliability for exascale systems. Kurt B. Ferreira, Jon Stearley, James H. Laros III, Ron Oldfield, Kevin T. Pedretti, Ron Brightwell, Rolf Riesen, Patrick G. Bridges, Dorian Arnold. 44
- Modeling and tolerating heterogeneous failures in large parallel systems. Eric Martin Heien, Derrick Kondo, Ana Gainaru, Dan LaPine, Bill Kramer, Franck Cappello. 45
- System implications of memory reliability in exascale computing. Sheng Li, Ke Chen, Ming Yu Hsieh, Naveen Muralimanohar, Chad D. Kersey, Jay B. Brockman, Arun F. Rodrigues, Norman P. Jouppi. 46
- TRACON: interference-aware scheduling for data-intensive applications in virtualized environments. Ron Chi-Lung Chiang, H. Howie Huang. 47
- Flexible resource allocation for reliable virtual cluster computing systems. Thomas J. Hacker, Kanak Mahadik. 48
- Auto-scaling to minimize cost and meet application deadlines in cloud workflows. Ming Mao, Marty Humphrey. 49
- Large scale debugging of parallel tasks with AutomaDeD. Ignacio Laguna, Todd Gamblin, Bronis R. de Supinski, Saurabh Bagchi, Greg Bronevetsky, Dong H. Ahn, Martin Schulz, Barry Rountree. 50
- Efficient data race detection for distributed memory parallel programs. Chang-Seo Park, Koushik Sen, Paul Hargrove, Costin Iancu. 51
- Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation. Trevor E. Carlson, Wim Heirman, Lieven Eeckhout. 52
- MAximum Multicore POwer (MAMPO): an automatic multithreaded synthetic power virus generation framework for multicore systems. Karthik Ganesan, Lizy K. John. 53
- Performance of the community earth system model. Patrick H. Worley, Arthur A. Mirin, Anthony P. Craig, Mark A. Taylor, John M. Dennis, Mariana Vertenstein. 54
- Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning. Samuel Williams, Leonid Oliker, Jonathan Carter, John Shalf. 55
- ab initio genomic motif identification. Benoit Marchand, Vladimir B. Bajic, Dinesh K. Kaushik. 56
- Hadoop acceleration through network levitated merge. Yandong Wang, Xinyu Que, Weikuan Yu, Dror Goldenberg, Dhiraj Sehgal. 57
- Purlieus: locality-aware resource allocation for MapReduce in a cloud. Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain. 58
- A distributed look-up architecture for text mining applications using MapReduce. Atilla Soner Balkir, Ian T. Foster, Andrey Rzhetsky. 59
- Copernicus: a new paradigm for parallel adaptive molecular dynamics. Sander Pronk, Per Larsson, Iman Pouya, Gregory R. Bowman, Imran S. Haque, Kyle Beauchamp, Berk Hess, Vijay S. Pande, Peter M. Kasson, Erik Lindahl. 60
- Enabling and scaling biomolecular simulations of 100 million atoms on petascale machines with a multicore-optimized message-driven runtime. Chao Mei, Yanhua Sun, Gengbin Zheng, Eric J. Bohm, Laxmikant V. Kalé, James C. Phillips, Chris Harrison. 61
- Parallelization design on multi-core platforms in density matrix renormalization group toward 2-D quantum strongly-correlated systems. Susumu Yamada, Toshiyuki Imamura, Masahiko Machida. 62
- A scalable eigensolver for large scale-free graphs using 2D graph partitioning. Andy Yoo, Allison H. Baker, Roger A. Pearce, Van Emden Henson. 63
- Scalable stochastic optimization of complex energy systems. Miles Lubin, Cosmin G. Petra, Mihai Anitescu, Victor M. Zavala. 64
- Parallel breadth-first search on distributed memory systems. Aydin Buluç, Kamesh Madduri. 65
- SciHadoop: array-based query processing in Hadoop. Joe B. Buck, Noah Watkins, Jeff LeFevre, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis, Scott A. Brandt. 66
- On the duality of data-intensive file system design: reconciling HDFS and PVFS. Wittawat Tantisiriroj, Seung Woo Son, Swapnil Patil, Samuel Lang, Garth Gibson, Robert B. Ross. 67
- End-to-end network QoS via scheduling of flexible resource reservation requests. Sushant Sharma, Dimitrios Katramatos, Dantong Yu. 68
- High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach. Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Jee Choi, Bálint Joó, Jatin Chhugani, Michael A. Clark, Pradeep Dubey. 69
- Scaling lattice QCD beyond 100 GPUs. Ronald Babich, Michael A. Clark, Bálint Joó, G. Shi, Richard C. Brower, S. Gottlieb. 70
- Large scale plane wave pseudopotential density functional theory calculations on GPU clusters. Long Wang, Yue Wu, Weile Jia, Weiguo Gao, Xuebin Chi, Lin-Wang Wang. 71
- Scalable implementations of accurate excited-state coupled cluster theories: application of high-level methods to porphyrin-based systems. Karol Kowalski, Sriram Krishnamoorthy, Ryan M. Olson, Vinod Tipparaju, Edoardo Aprà. 72
- Hardware/software co-design for energy-efficient seismic modeling. Jens Krueger, David Donofrio, John Shalf, Marghoob Mohiyuddin, Samuel Williams, Leonid Oliker, Franz-Josef Pfreund. 73
- A fast solver for modeling the evolution of virus populations. Gerhard Niederbrucker, Wilfried N. Gansterer. 74
- Optimizing the Barnes-Hut algorithm in UPC. Junchao Zhang, Babak Behzad, Marc Snir. 75
- Avoiding hot-spots on two-level direct networks. Abhinav Bhatele, Nikhil Jain, William D. Gropp, Laxmikant V. Kalé. 76
- Improving communication performance in dense linear algebra via topology aware collectives. Edgar Solomonik, Abhinav Bhatele, James Demmel. 77