- Analytical cache models with applications to cache partitioning. G. Edward Suh, Srinivas Devadas, Larry Rudolph. 1-12
- A synthesis of memory mechanisms for distributed architectures. Jiajing Zhu, Jay Hoeflinger, David A. Padua. 13-22
- The trade-off between implicit and explicit data distribution in shared-memory programming paradigms. Dimitrios S. Nikolopoulos, Eduard Ayguadé, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta. 23-37
- Fractal symbolic analysis. Nikolay Mateev, Vijay Menon, Keshav Pingali. 38-49
- Data locality enhancement by memory reduction. Yonghong Song, Rong Xu, Cheng Wang, Zhiyuan Li. 50-64
- Eliminating redundancies in sum-of-product array computations. Steven J. Deitz, Bradford L. Chamberlain, Lawrence Snyder. 65-77
- Monotonic evolution: an alternative to induction variable substitution for dependence analysis. Peng Wu, Albert Cohen, Jay Hoeflinger, David A. Padua. 78-91
- Optimizing strategies for telescoping languages: procedure strength reduction and procedure vectorization. Arun Chauhan, Ken Kennedy. 92-101
- Loop optimization for a class of memory-constrained computations. Daniel Cociorva, J. W. Wilkins, Chi-Chung Lam, Gerald Baumgartner, J. Ramanujam, P. Sadayappan. 103-113
- Fast parallel in-memory 64-bit sorting. Daniel Jiménez-González, Juan J. Navarro, Josep-Lluis Larriba-Pey. 114-122
- Array language support for parallel sparse computation. Bradford L. Chamberlain, Lawrence Snyder. 133-145
- A parallel algorithm for sparse symbolic LU factorization without pivoting on out-of-core matrices. Michel Cosnard, Laura Grigori. 146-153
- Tools for application-oriented performance tuning. John M. Mellor-Crummey, Robert J. Fowler, David B. Whalley. 154-165
- Global optimization techniques for automatic parallelization of hybrid applications. Dhruva R. Chakrabarti, Prithviraj Banerjee. 166-180
- Tuning high-performance scientific codes: the use of performance models to control resource usage during data migration and I/O. Jonghyun Lee, Marianne Winslett, Xiaosong Ma, Shengke Yu. 181-195
- Computer aided hand tuning (CAHT): applying case-based reasoning to performance tuning. Antoine Monsifrot, François Bodin. 196-203
- Cache performance for multimedia applications. Nathan T. Slingerland, Alan Jay Smith. 204-217
- On the potential of tolerant region reuse for multimedia applications. Carlos Álvarez, Jesús Corbal, Esther Salamí, Mateo Valero. 218-228
- Evaluation of processor code efficiency for embedded systems. Morgan Hirosuke Miki, Mamoru Sakamoto, Shingo Miyamoto, Yoshinori Takeuchi, Toyohiko Yoshida, Isao Shirakawa. 229-235
- Bringing together automatic differentiation and OpenMP. H. Martin Bücker, Bruno Lang, Dieter an Mey, Christian H. Bischof. 246-251
- Automatic code generation for a turbulence scheme. Paul van der Mark, Gerard Cats, Lex Wolters. 252-259
- Towards the effective parallel computation of matrix pseudospectra. Constantine Bekas, Effrosini Kokiopoulou, Ioannis Koutis, Efstratios Gallopoulos. 260-269
- A graphical tool for driving the parallel computation of pseudospectra. Dany Mezher. 270-276
- Register-sensitive selection, duplication, and sequencing of instructions. Vivek Sarkar, Mauricio J. Serrano, Barbara B. Simons. 277-288
- Load and store reuse using register file contents. Soner Önder, Rajiv Gupta. 289-302
- Improving Gang Scheduling through job performance analysis and malleability. Julita Corbalán, Xavier Martorell, Jesús Labarta. 303-311
- Reducing the complexity of the issue logic. Ramon Canal, Antonio González. 312-320
- Slice-processors: an implementation of operation-based prediction. Andreas Moshovos, Dionisios N. Pnevmatikatos, Amirali Baniasadi. 321-334
- Building a high-performance communication layer over virtual interface architecture on Linux clusters. Jin-Soo Kim, Kangho Kim, Sung-In Jung. 335-347
- Integrating superscalar processor components to implement register caching. Matt Postiff, David Greene, Steven E. Raasch, Trevor N. Mudge. 348-357
- alpha-coral: a multigrain, multithreaded processor architecture. Mark N. Yankelevsky, Constantine D. Polychronopoulos. 358-367
- Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor. Chong-liang Ooi, Seon Wook Kim, Il Park, Rudolf Eigenmann, Babak Falsafi, T. N. Vijaykumar. 368-380
- Optimizing threaded MPI execution on SMP clusters. Hong Tang, Tao Yang. 381-392
- Demonstrating the scalability of a molecular dynamics application on a Petaflop computer. George S. Almasi, Calin Cascaval, José G. Castaños, Monty Denneau, Wilm E. Donath, Maria Eleftheriou, Mark Giampapa, C. T. Howard Ho, Derek Lieber, José E. Moreira, Dennis M. Newns, Marc Snir, Henry S. Warren Jr. 393-406
- Computational challenges in large-scale air pollution modelling. Tzvetan Ostromsky, Wojciech Owczarz, Zahari Zlatev. 407-418
- A network of cellular automata for a landslide simulation. Claudia Roberta Calidonna, Claudia Di Napoli, Maurizio Giordano, Mario Mango Furnari, Salvatore Di Gregorio. 419-426
- Improving Java performance using hardware translation. Ramesh Radhakrishnan, Ravi Bhargava, Lizy Kurian John. 427-439
- A framework for efficient reuse of binary code in Java. Pramod G. Joisha, Samuel P. Midkiff, Mauricio J. Serrano, Manish Gupta. 440-453
- Algorithmic modifications to the Jacobi-Davidson parallel eigensolver to dynamically balance external CPU and memory load. Richard Tran Mills, Andreas Stathopoulos, Evgenia Smirni. 454-463
- Workload decomposition for particle simulation applications on hierarchical distributed-shared memory parallel systems with integration of HPF and OpenMP. Sergio Briguglio, Beniamino Di Martino, Gregorio Vlad. 464
- ARIMA time series modeling and forecasting for adaptive I/O prefetching. Nancy Tran, Daniel A. Reed. 473-485
- A novel renaming mechanism that boosts software prefetching. Daniel Ortega, Mateo Valero, Eduard Ayguadé. 501-510