- Algorithms for Block Bidiagonal Systems on Vector and Parallel Computers. Markus Hegland, Michael R. Osborne. 1-6
- Development of a Mathematical Subroutine Library for Fujitsu Vector Parallel Processors. Richard P. Brent, L. Grosz, David L. Harrar II, Markus Hegland, M. Kahn, G. Keating, G. Mercer, Ole Møller Nielsen, Michael R. Osborne, Bing Bing Zhou, M. Nakanishi. 13-20
- The Potential of Data Value Speculation to Boost ILP. José González, Antonio González. 21-28
- Load Execution Latency Reduction. Bryan Black, Brian Mueller, Stephanie Postal, Ryan Rakvic, Noppanunt Utamaphethai, John Paul Shen. 29-36
- A Performance Study of Out-of-order Vector Architectures and Short Registers. Luis Villa, Roger Espasa, Mateo Valero. 37-44
- Efficient Support of Parallel Sparse Computation for Array Intrinsic Functions of Fortran 90. Rong-Guey Chang, Tyng-Ruey Chuang, Jenq Kuen Lee. 45-52
- Comparing Data Forwarding and Prefetching for Communication-induced Misses in Shared-memory MPs. David Koufaty, Josep Torrellas. 53-60
- An Efficient Uniform Run-time Scheme for Mixed Regular-irregular Applications. Dhruva R. Chakrabarti, U. Nagaraj Shenoy, Alok N. Choudhary, Prithviraj Banerjee. 61-68
- A Hyperplane Based Approach for Optimizing Spatial Locality in Loop Nests. Mahmut T. Kandemir, Alok N. Choudhary, U. Nagaraj Shenoy, Prithviraj Banerjee, J. Ramanujam. 69-76
- Speculative Multithreaded Processors. Pedro Marcuello, Antonio González, Jordi Tubella. 77-84
- Hardware and Software Support for Speculative Execution of Sequential Binaries on a Chip-multiprocessor. Venkata Krishnan, Josep Torrellas. 85-92
- Coarse-grained Speculative Execution in Shared-memory Multiprocessors. Iffat H. Kazi, David J. Lilja. 93-100
- Multipath Execution: Opportunities and Limits. Pritpal S. Ahuja, Kevin Skadron, Margaret Martonosi, Douglas W. Clark. 101-108
- High-level Management of Communication Schedules in HPF-like Languages. Siegfried Benkner, Piyush Mehrotra, John Van Rosendale, Hans P. Zima. 109-116
- Problem and Machine Sensitive Communication Optimization. Thomas Fahringer, Eduard Mehofer. 117-124
- Loop Fusion in High Performance Fortran. Gerald Roth, Ken Kennedy. 125-132
- A General Algorithm for Tiling the Register Level. Marta Jiménez, José M. Llabería, Agustin Fernández, Enric Morancho. 133-140
- Application Level Scheduling of Gene Sequence Comparison on Metacomputers. Neil T. Spring, Richard Wolski. 141-148
- Local Area Metacomputing for Multidisciplinary Problems: A Case Study for Fluid/Structure Coupled Simulation. Toshiya Kimura, Hiroshi Takemiya. 149-156
- Exploiting Heterogeneous Parallelism in the Presence of Communication Delays. Dingchao Li, Yuji Iwahori, Naohiro Ishii. 157-164
- Optimizing and Load Balancing Metacomputing Applications. Jörg Henrichs. 165-171
- Predicting Parallel Applications Performance on Non-Dedicated Cluster Platforms. Cosimo Anglano. 172-179
- A User Level Program Transformation Tool. François Bodin, Yann Mével, Rene Quiniou. 180-187
- The Role of Associativity and Commutativity in the Detection and Transformation of Loop-level Parallelism. William M. Pottenger. 188-195
- The Infinity Lambda Test. Weng-Long Chang, Chih-Ping Chu. 196-203
- Predicated Array Data-flow Analysis for Run-time Parallelization. Sungdo Moon, Mary W. Hall, Brian R. Murphy. 204-211
- Measuring the Effectiveness of Automatic Parallelization in SUIF. Byoungro So, Sungdo Moon, Mary W. Hall. 212-219
- PARADISE: An Advanced Featured Parallel File System. Maciej Brodowicz, Olin Johnson. 220-226
- Distributed Data Structure Design for Scientific Computation. Jan-Jan Wu, Pangfeng Liu. 227-234
- Bounding on the Gain of Optimizing Data Layout in Vector Processors. Lars Lundberg, Daniel Häggander. 235-242
- The Design and Implementation of Zero Copy MPI Using Commodity Hardware with a High Performance Network. Francis O'Carroll, Hiroshi Tezuka, Atsushi Hori, Yutaka Ishikawa. 243-250
- Monitoring Shared Virtual Memory Performance on a Myrinet-based PC Cluster. Cheng Liao, Dongming Jiang, Liviu Iftode, Margaret Martonosi, Douglas W. Clark. 251-258
- MBCF: A Protected and Virtualized High-Speed User-Level Memory-Based Communication Facility. Takashi Matsumoto, Kei Hiraki. 259-266
- Highly Efficient Implementation of MPI Point-to-Point Communication Using Remote Memory Operations. Osamu Tatebe, Yuetsu Kodama, Satoshi Sekiguchi, Yoshinori Yamaguchi. 267-273
- Evaluation of Hardware Write Propagation Support for Next-generation Shared Virtual Memory Clusters. Angelos Bilas, Liviu Iftode, Jaswinder Pal Singh. 274-281
- Techniques for Empirical Testing of Parallel Random Number Generators. Paul D. Coddington, Sung Hoon Ko. 282-288
- Parallel Compiled Event Driven VHDL Simulation. Venkatram Krishnaswamy, Prithviraj Banerjee. 297-304
- Load Balanced Parallel Radix Sort. Andrew Sohn, Yuetsu Kodama. 305-312
- Integer Sorting on Shared-Memory Vector Parallel Computers. Kenji Suehiro, Hitoshi Murai, Yoshiki Seo. 313-320
- Speculative Execution Model with Duplication. Kei Hiraki, Junji Tamatsukuri, Takashi Matsumoto. 321-328
- Dependence Driven Execution for Multiprogrammed Multiprocessor. Suvas Vajracharya, Dirk Grunwald. 329-336
- Kernel-level Scheduling for the Nano-threads Programming Model. Eleftherios D. Polychronopoulos, Xavier Martorell, Dimitrios S. Nikolopoulos, Jesús Labarta, Theodore S. Papatheodorou, Nacho Navarro. 337-344
- Scalable On-the-fly Detection of the First Races in Parallel Programs. Jeong-Si Kim, Yong-Kee Jun. 345-352
- Eliminating Conflict Misses for High Performance Architectures. Gabriel Rivera, Chau-Wen Tseng. 353-360
- Prefetching on the Cray-T3E. Matthias M. Müller, Thomas M. Warschko, Walter F. Tichy. 361-368
- Characterization and Improvement of Load/Store Cache-based Prefetching. Pablo Ibáñez, Víctor Viñals, José Luis Briz, María Jesús Garzarán. 369-376
- Hardware-driven Prefetching for Pointer Data References. Chi-Hung Chi, Chin-Ming Cheung. 377-384
- Data Prefetching for Software DSMs. Ricardo Bianchini, Raquel Pinto, Claudio Luis de Amorim. 385-392
- Depth Contention-free Broadcasting on Torus Networks. Yomin Hou, Chien-Min Wang, Lih-Hsing Hsu. 393-400
- OPTNET: A Cost-effective Optical Network for Multiprocessors. Enrique V. Carrera, Ricardo Bianchini. 401-408
- Applying Segment Routing to k-ary n-cube Networks. Cruz Izu, Agustin Arruabarrena. 409-416
- Dynamic Load Balancing for Adaptive Meshes Using Symmetric Broadcast Networks. Sajal K. Das, Daniel J. Harvey, Rupak Biswas. 417-424
- Vector Architectures: Past, Present and Future. Roger Espasa, Mateo Valero, James E. Smith. 425-432
- Kin: A High Performance Asynchronous Processor Architecture. Rakefet Kol, Ran Ginosar. 433-440
- Resource Widening Versus Replication: Limits and Performance-cost Trade-off. David López, Josep Llosa, Mateo Valero, Eduard Ayguadé. 441-448
- Utilizing Reuse Information in Data Cache Management. Jude A. Rivers, Edward S. Tam, Gary S. Tyson, Edward S. Davidson, Matthew K. Farrens. 449-456
- A Study of Three Dynamic Approaches to Handle Widely Shared Data in Shared-memory Multiprocessors. Stefanos Kaxiras, Stein Gjessing, James R. Goodman. 457-464