- A modern high-performance processor pipeline. Marc Tremblay. 1
- Experimental evaluation of application-level checkpointing for OpenMP programs. Greg Bronevetsky, Keshav Pingali, Paul Stodghill. 2-13
- Cooperative checkpointing: a robust approach to large-scale systems reliability. Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo. 14-23
- On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings. Arun Kejariwal, Xinmin Tian, Wei Li, Milind Girkar, Sergey Kozhukhov, Hideki Saito, Utpal Banerjee, Alexandru Nicolau, Alexander V. Veidenbaum, Constantine D. Polychronopoulos. 24
- BranchTap: improving performance with very few checkpoints through adaptive speculation control. Patrick Akl, Andreas Moshovos. 36-45
- Selective predicate prediction for out-of-order processors. Eduardo Quiñones, Joan-Manuel Parcerisa, Antonio González. 46-54
- Wide and efficient trace prediction using the local trace predictor. Juan C. Moure, Domingo Benitez, Dolores Rexachs, Emilio Luque. 55-65
- Scientific applications vs. SPEC-FP: a comparison of program behavior. Kyle Rupnow, Arun Rodrigues, Keith D. Underwood, Katherine Compton. 66-74
- The exigency of benchmark and compiler drift: designing tomorrow's processors with yesterday's tools. Joshua J. Yi, Hans Vandierendonck, Lieven Eeckhout, David J. Lilja. 75-86
- Accurate memory data flow modeling in statistical simulation. Davy Genbrugge, Lieven Eeckhout, Koen De Bosschere. 87-96
- Efficient remote block-level I/O over an RDMA-capable NIC. Manolis Marazakis, Konstantinos Xinidis, Vassilis Papaefstathiou, Angelos Bilas. 97-106
- A scalable communication layer for multi-dimensional hyper crossbar network using multiple gigabit ethernet. Shinji Sumimoto, Kazuichi Ooe, Kouichi Kumon, Taisuke Boku, Mitsuhisa Sato, Akira Ukawa. 107-115
- Large files, small writes, and pNFS. Dean Hildebrand, Lee Ward, Peter Honeyman. 116-124
- A case for high performance computing with virtual machines. Wei Huang, Jiuxing Liu, Bülent Abali, Dhabaleswar K. Panda. 125-134
- Implementing virtual memory in a vector processor with software restart markers. Mark Hampton, Krste Asanovic. 135-144
- Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors. Dan Wallin, Henrik Löf, Erik Hagersten, Sverker Holmgren. 145-155
- Quantum mechanical approaches to information processing. Steven Prawer. 156
- Online power-performance adaptation of multithreaded programs using hardware event-based prediction. Matthew Curtis-Maury, James Dzierwa, Christos D. Antonopoulos, Dimitrios S. Nikolopoulos. 157-166
- A scalable low power issue queue for large instruction window processors. Rajesh Vivekanandham, Bharadwaj Amrutur, R. Govindarajan. 167-176
- Design space exploration for multicore architectures: a power/performance/thermal view. Matteo Monchiero, Ramon Canal, Antonio González. 177-186
- Design tradeoffs for tiled CMP on-chip networks. James D. Balfour, William J. Dally. 187-198
- STAR-MPI: self tuned adaptive routines for MPI collective operations. Ahmad Faraj, Xin Yuan, David K. Lowenthal. 199-208
- Scaling MPI to short-memory MPPs such as BG/L. Montse Farreras, Toni Cortes, Jesús Labarta, George Almási. 209-218
- Scalable, fault tolerant membership for MPI tasks on HPC systems. Jyothish Varma, Chao Wang, Frank Mueller, Christian Engelmann, Stephen L. Scott. 219-228
- Coupling prefix caching and collective downloads for remote dataset access. Xiaosong Ma, Vincent W. Freeh, Tao Yang, Sudharshan Vazhkudai, Tyler A. Simon, Stephen L. Scott. 229-238
- Heterogeneous way-size cache. Jaume Abella, Antonio González. 239-248
- Profitable loop fusion and tiling using model-driven empirical search. Apan Qasem, Ken Kennedy. 249-258
- TMA: a trap-based memory architecture. Håkan Zeffer, Zoran Radovic, Martin Karlsson, Erik Hagersten. 259-268
- Scalable algorithms for global snapshots in distributed systems. Rahul Garg, Vijay K. Garg, Yogish Sabharwal. 269-277
- Feedback-directed memory disambiguation through store distance analysis. Changpeng Fang, Steve Carr, Soner Önder, Zhenlin Wang. 278-287
- Accelerator design for protein sequence HMM search. Rahul P. Maddimsetty, Jeremy Buhler, Roger D. Chamberlain, Mark A. Franklin, Brandon Harris. 288-296
- A distributed system based on web services for computational science simulations. Keshav Pingali, Paul Stodghill. 297-306
- Accelerating sparse matrix computations via data compression. Jeremiah Willcock, Andrew Lumsdaine. 307-316
- Sensitivity analysis of knapsack-based task scheduling on the grid. Daniel C. Vanderster, Nikitas J. Dimopoulos. 317-323
- Probabilistic accuracy bounds for fault-tolerant computations that discard tasks. Martin C. Rinard. 324-334
- Violated dependence analysis. Nicolas Vasilache, Cédric Bastoul, Albert Cohen, Sylvain Girbal. 335-344
- User-guided symbiotic space-sharing of real workloads. Jonathan Weinberg, Allan Snavely. 345-352
- MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters. Hu Chen, Wenguang Chen, Jian Huang, Bob Robert, H. Kuhn. 353-360
- Lightweight lock-free synchronization methods for multithreading. Arun Kejariwal, Hideki Saito, Xinmin Tian, Milind Girkar, Wei Li, Utpal Banerjee, Alexandru Nicolau, Constantine D. Polychronopoulos. 361-371