- Automatic Partitioning of Parallel Loops for Cache-Coherent MultiprocessorsAnant Agarwal, David A. Kranz, Venkat Natarajan. 2-11
- Using Synthetic-Perturbation Techniques for Tuning Shared Memory Programs (Extended Abstract)Robert Snelick, Joseph JáJá, Raghu Kacker, Gordon Lyon. 2-10
- Space-Time Representation of Iterative Algorithms and The Design of Regular Processor ArraysEfstathios D. Kyriakis-Bitzaros, Odysseas G. Koufopavlou, Constantinos E. Goutis. 2-9
- On the Parallel Diagonal Dominant AlgorithmXian-He Sun. 10-17
- Comparing Data-Parallel and Message-Passing ParadigmsAlexander C. Klaiber, James L. Frankel. 11-20
- Techniques to Enhance Cache Performance Across Parallel Program SectionsJih-Kwon Peir, Kimming So, Ju-Ho Tang. 12-19
- Supernodal Sparse Cholesky Factorization on Distributed-Memory MultiprocessorsKalluri Eswar, P. Sadayappan, Chua-Huang Huang, V. Visvanathan. 18-22
- Function-Parallel Computation in a Data-Parallel EnvironmentAlex L. Cheung, Anthony P. Reeves. 21-24
- Parallel FFT Algorithms for Cache Based Shared Memory MultiprocessorsAkhilesh Kumar, Laxmi N. Bhuyan. 23-27
- Automatic Parallelization Techniques for the EM-4Lubomir Bic, Mayez A. Al-Mouhamed. 25-28
- Semi-Unified CachesNathalie Drach, André Seznec. 25-28
- An Analysis of Hashing on Parallel and Vector ComputersThomas J. Sheffler, Randal E. Bryant. 29-36
- Automating Parallelization of Regular Computations for Distributed-MemoryErnesto Su, Daniel J. Palermo, Prithviraj Banerjee. 30-38
- Dependence Analysis and Architecture Design for Bit-Level AlgorithmsWeijia Shang, Benjamin W. Wah. 30-38
- Multiple Quadratic Forms: A Case Study in the Design of Scalable AlgorithmsMu-Cheng Wang, Wayne G. Nation, James B. Armstrong, Howard Jay Siegel, Shin-Dug Kim, Mark A. Nichols, Michael Gherrity. 37-46
- Compilation Techniques for Optimizing Communication on Distributed-Memory SystemsChun Gong, Rajiv Gupta, Rami G. Melhem. 39-46
- ATOMIC: A Low-Cost, Very-High-Speed, Local Communication ArchitectureDanny Cohen, Gregory G. Finn, Robert E. Felderman, Annette L. DeSchon. 39-46
- Meta-State ConversionHenry G. Dietz, G. Krishnamurthy. 47-56
- Reconfigurable Branch Processing Strategy in Super-Scalar MicroprocessorsTerence M. Potter, Hsiao-chen Chung, Chuan-lin Wu. 47-50
- Data-Parallel R-Tree AlgorithmsErik G. Hoel, Hanan Samet. 47-50
- Exploiting Spatial and Temporal Parallelism in the Multithreaded Node Architecture Implemented on Superscalar RISC ProcessorsDaejoon Hwang, Seung Ho Cho, Y. D. Kim, Sangyong Han. 51-54
- Time Parallel Algorithms for Solution of Linear Parabolic PDEsAmir Fijany. 51-55
- Fixed and Adaptive Sequential Prefetching in Shared Memory MultiprocessorsFredrik Dahlgren, Michel Dubois, Per Stenström. 56-63
- Computing Connected Components and Some Related Applications on a RAPTzong-Wann Kao, Shi-Jinn Horng, Horng-Ren Tsai. 57-64
- A Unified Model for Concurrent DebuggingS. Irfan Hyder, John Werth, James C. Browne. 58-67
- Assigning Sites to Redundant Clusters in a Distributed Storage SystemAntoine N. Mourad, W. Kent Fuchs, Daniel G. Saab. 64-71
- PARSA: A Parallel Program Scheduling and Assessment EnvironmentBehrooz Shirazi, Krishna M. Kavi, Ali R. Hurson, Prasenjit Biswas. 68-72
- Balanced Distributed Memory Parallel ComputersFranck Cappello, Jean-Luc Béchennec, Franck Delaplace, Cécile Germain, Jean-Louis Giavitto, Vincent Néri, Daniel Etiemble. 72-76
- VSTA: A Prolog-Based Formal Verifier for Systolic Array DesignsNam Ling, Timothy K. Shih. 73-81
- On Embeddings of Rectangles into Optimal SquaresShou-Hsuan Stephen Huang, Hongfei Liu, Rakesh M. Verma. 73-76
- Finding Articulation Points and Bridges of Permutation GraphsOscar H. Ibarra, Qi Zheng. 77-80
- A Novel Approach to the Design of Scalable Shared-Memory MultiprocessorsHonda Shing, Lionel M. Ni. 77-81
- Pattern Recognition Using FractalsDavid W. N. Sharp, R. Lyndon While. 82-89
- Decremental Scattering for Data Transport Between Host and Hypercube NodesMukesh Sharma, Meghanad D. Wagh. 82-85
- Incomplete Star Graph: An Economical Fault-tolerant Interconnection NetworkC. P. Ravikumar, A. Kuchlous, G. Manimaran. 83-90
- Memory Reference Behavior of Compiler Optimized Programs on High SpeedJohn W. C. Fu, Janak H. Patel. 87-94
- Efficient Image Processing Algorithms on the Scan Line Array ProcessorDavid R. Helman, Joseph JáJá. 90-93
- The Star Connected Cycles: A Fixed-Degree Network for Parallel ProcessingShahram Latifi, Marcelo M. de Azevedo, Nader Bagherzadeh. 91-95
- Iteration Partitioning for Resolving Stride Conflicts on Cache-Coherent MultiprocessorsKaren A. Tomko, Santosh G. Abraham. 95-102
- Empirical Evaluation of Incomplete Hypercube SystemsNian-Feng Tzeng. 96-99
- O(n)-Time and O(log n)-Space Image Component Labeling with Local Operators on SIMD Mesh Connected ComputersHongchi Shi, Gerhard X. Ritter. 98-101
- A Distributed Multicast Algorithm for Hypercube MulticomputersJyh-Charn Liu, Hung-Ju Lee. 100-104
- Solving the Region Growing Problem on the Connection MachineNawal Copty, Sanjay Ranka, Geoffrey Fox, Ravi V. Shankar. 102-105
- Performance and Scalability Aspects of Directory-Based Cache Coherence in Shared-Memory MultiprocessorsSilvio Picano, David G. Meyer, Eugene D. Brooks III, Joseph E. Hoag. 103-106
- A Generalized Bitonic Sorting NetworkKathy J. Liszka, Kenneth E. Batcher. 105-108
- Minimum Completion Time Criterion for Parallel Sparse Cholesky FactorizationWen-Yang Lin, Chuen-Liang Chen. 107-114
- Compiling for Hierarchical Shared Memory MultiprocessorsJeff D. Martens, D. N. Jayasimha. 107-110
- A Lazy Scheduling Scheme for Improving Hypercube PerformancePrasant Mohapatra, Chansu Yu, Chita R. Das, Jong Kim. 110-117
- Efficient Use of Dynamically Tagged Directories Through Compiler AnalysisTrung N. Nguyen, Zhiyuan Li, David J. Lilja. 112-119
- Scalability of Parallel Algorithms for Matrix MultiplicationAnshul Gupta, Vipin Kumar. 115-123
- Fast and Efficient Strategies for Cubic and Non-Cubic Allocation in Hypercube MultiprocessorsDebendra Das Sharma, Dhiraj K. Pradhan. 118-127
- Trailblazing: A Hierarchical Approach to Percolation SchedulingAlexandru Nicolau, Steven Novack. 120-124
- Generalised Matrix Inversion by Successive Matrix SquaringLujuan Chen, E. V. Krishnamurthy, Iain MacLeod. 124-127
- Contention-Free 2D-Mesh Cluster Allocation in HypercubesStephen W. Turner, Lionel M. Ni, Betty H. C. Cheng. 125-129
- Parallel Computation of the Singular Value Decomposition on Tree ArchitecturesBing Bing Zhou, Richard P. Brent. 128-131
- A Task Allocation Algorithm in a Multiprocessor Real-Time SystemJean-Pierre Beauvais, Anne-Marie Déplanche. 130-133
- Fault Tolerant Subcube Allocation in HypercubesYeimkuan Chang, Laxmi N. Bhuyan. 132-136
- A Fault-Tolerant Parallel Algorithm for Iterative Solution of the Laplace EquationAmber Roy-Chowdhury, Prithviraj Banerjee. 133-140
- Processor Allocation and Scheduling of Macro Dataflow Graphs on Distributed Memory Multicomputers by the PARADIGM CompilerShankar Ramaswamy, Prithviraj Banerjee. 134-138
- Performance of Redundant Disk Array Organizations in Transaction Processing EnvironmentsAntoine N. Mourad, W. Kent Fuchs, Daniel G. Saab. 138-145
- Locality and Loop Scheduling on NUMA MultiprocessorsHui Li, Sudarsan Tandri, Michael Stumm, Kenneth C. Sevcik. 140-147
- Emulating Reconfigurable Arrays for Image Processing Using the MasPar ArchitectureJosé Salinas, Fabrizio Lombardi. 141-148
- Compile-Time Characterization of Recurrent Patterns in Irregular ComputationsKalluri Eswar, P. Sadayappan, Chua-Huang Huang. 148-155
- Ring Embedding in an Injured HypercubeYu-Chee Tseng, Ten-Hwang Lai. 149-152
- An Adaptive System-Level Diagnosis Approach for Mesh Connected MultiprocessorsChao Feng, Laxmi N. Bhuyan, Fabrizio Lombardi. 153-157
- A Scalable Optical Interconnection Network for Fine-Grain Parallel ArchitecturesD. Scott Wills, Matthias Grossglauser. 154-157
- Investigating Properties of Code TransformationsDeborah Whitfield, Mary Lou Soffa. 156-160
- Bus-Based Tree Structures for Efficient Parallel ComputationOmkar M. Dighe, Ramachandran Vaidyanathan, Si-Qing Zheng. 158-161
- Fast Parallel Algorithms for Routing One-To-One Assignments in Benes NetworksChing-Yi Lee, A. Yavuz Oruç. 159-166
- Efficient Stack Simulation for Shared Memory Set-Associative Multiprocessor CachesChing-Farn Eric Wu, Yarsun Hsu, Yew-Huey Liu. 163-170
- Solving Dynamic and Irregular Problems on SIMD Architectures with Runtime SupportWei Shu, Min-You Wu. 167-174
- Optimal Routing Algorithms for Generalized de Bruijn DigraphsGuoping Liu, Kyungsook Y. Lee. 167-174
- Parallel Cache Simulation on Multiprocessor WorkstationsLuis Barriga, Rassul Ayani. 171-174
- A Class of Partially Adaptive Routing Algorithms for n-dimensional MeshesYounes M. Boura, Chita R. Das. 175-182
- Evaluation of Data Distribution Patterns in Distributed-Memory MachinesEdgar T. Kalns, Hong Xu, Lionel M. Ni. 175-183
- Evaluating the Impact of Cache Interferences on Numerical CodesOlivier Temam, Christine Fricker, William Jalby. 180-183
- Activity Counter: New Optimization for the Dynamic Scheduling of SIMD Control FlowRonan Keryell, Nicolas Paris. 184-187
- Performance Evaluation of Memory Caches in MultiprocessorsYung-Chin Chen, Alexander V. Veidenbaum. 184-187
- Generation of Long Sorted Runs on a Unidirectional ArrayYen-Chun Lin, Horng-Yi Lai. 184-191
- SIMD Optimizations in a Data Parallel CMaya Gokhale, Phil Pfeiffer. 188-191
- Transmission Times in Buffered Full-Crossbar Communication Networks With Cyclic ArbitrationA. J. Field, Peter G. Harrison. 189-196
- Time- and VLSI-Optimal Sorting on Meshes with Multiple BroadcastingDharmavani Bhagavathi, Himabindu Gurla, Stephan Olariu, James L. Schwing, W. Shen, Larry Wilson, Jingyuan Zhang. 192-195
- An Adaptive Submesh Allocation Strategy For Two-Dimensional Mesh Connected SystemsJianxun Ding, Laxmi N. Bhuyan. 193-200
- A Comparison Based Parallel Sorting AlgorithmLaxmikant V. Kalé, Sanjeev Krishnan. 196-200
- Experimental Validation of a Performance Model for Simple Layered Task SystemsAthar B. Tayyab, Jon G. Kuhl. 197-201
- SnakeSort: A Family of Simple Optimal Randomized Sorting AlgorithmsDavid T. Blackston, Abhiram G. Ranade. 201-204
- Performance Evaluation of SIMD Processor Architectures Using Pairwise Multiplier RecodingTodd C. Marek, Edward W. Davis. 202-205
- Experiments with Configurable Locks for MultiprocessorsBodhisattwa Mukherjee, Karsten Schwan. 205-208
- Merging Multiple Lists in O(log n) TimeZhaofang Wen. 205-208
- Performance Considerations Relating to the Design of Interconnection Networks for Multiprocessing SystemsEarl Hokens, Ahmed Louri. 206-209
- On the Bit-Level Complexity of Bitonic Sorting NetworksMajed Z. Al-Hajery, Kenneth E. Batcher. 209-213
- A Queuing Model for Finite-Buffered Multistage Interconnection NetworksPrasant Mohapatra, Chita R. Das. 210-213
- A Distributed Load Balancing Scheme for Data Parallel ApplicationsWalid R. Tout, Sakti Pramanik. 213-216
- Composite Performance and Reliability Analysis for Hypercube SystemsSamir M. Koriem, Lalit M. Patnaik. 214-217
- Multicoloring for Fast Sparse Matrix-Vector Multiplication in Solving PDE ProblemsHwang-Cheng Wang, Kai Hwang. 215-222
- Would You Run It Here...Or There? (AHS: Automatic Heterogeneous Supercomputing)Henry G. Dietz, William E. Cohen, B. K. Grant. 217-221
- Estimation of Execution Times on Heterogeneous Supercomputer ArchitecturesJaehyung Yang, Ishfaq Ahmad, Arif Ghafoor. 219-226
- Efficient Parallel Shortest Path Algorithms for Banded MatricesYijie Han, Yoshihide Igarashi. 223-226
- A Concurrent Dynamic Task GraphTheodore Johnson. 223-230
- Parallel Implementations of a Scalable Consistent Labeling Technique on Distributed Memory Multi-Processor SystemsWei-Ming Lin, Zhenhong Lu. 227-230
- Adaptive Deadlock-Free Routing in Multicomputers Using Only One Extra Virtual ChannelChien-Chun Su, Kang G. Shin. 227-231
- Maximally Fault Tolerant Directed Network Graph With Sublogarithmic Diameter For Arbitrary Number of NodesPradip K. Srimani. 231-234
- Unified Static Scheduling on Various ModelsLiang-Fang Chao, Edwin Hsing-Mean Sha. 231-235
- A Hybrid Shared Memory/Message Passing Parallel MachineMatthew Frank, Mary K. Vernon. 232-236
- Fast Arithmetic on Reconfigurable MeshesHeonchul Park, Viktor K. Prasanna, Ju-wook Jang. 236-243
- Dataflow Graph Optimization for Dataflow Architectures - A Dataflow Optimizing CompilerSholin Kyo, Shin ichiro Okazaki, Masanori Mizoguchi. 236-240
- Scalability Study of the KSR-1Umakishore Ramachandran, Gautam Shah, Ravi Kumar, Jeyakumar Muthukumarasamy. 237-240
- Increasing Instruction-level Parallelism through Multi-way BranchingSoo-Mook Moon. 241-245
- Personalized Communication Avoiding Node Contention on Distributed Memory SystemsSanjay Ranka, Jhy-Chun Wang, Manoj Kumar. 241-244
- Design of Algorithm-Based Fault Tolerant Systems With In-System ChecksShalini Yajnik, Niraj K. Jha. 246-253
- Optimizing Parallel Programs Using Affinity RegionsWilliam F. Appelbe, Balakrishnan Lakshmanan. 246-249
- On the Practical Application of a Quantitative Model of System Reconfiguration Due to a FaultGene Saghi, Howard Jay Siegel, José A. B. Fortes. 248-252
- A Model for Automatic Data PartitioningPaul D. Hovland, Lionel M. Ni. 251-259
- A Cache Coherence Protocol for MIN-Based Multiprocessors With Limited InclusionMazin S. Yousif, Chita R. Das, Matthew Thazhuthaveetil. 254-257
- Impact of Memory Contention on Dynamic Scheduling on NUMA MultiprocessorsDannie Durand, Thierry Montaut, Lionel Kervella, William Jalby. 258-262
- A Parallel Scheduling Method for Efficient Query ProcessingAbdelkader Hameurlain, Franck Morvan. 258-262
- Multi-Level Communication Structure for Hierarchical Grain AggregationHelen Gill, Thomas J. Smith, Thomas E. Gerasch, Carolyn McCreary. 260-264
- Automated Learning of Workload Measures for Load Balancing on a Distributed SystemPankaj Mehra, Benjamin W. Wah. 263-270
- Reliability Evaluation of Disk Array ArchitecturesJohn A. Chandy, Prithviraj Banerjee. 263-267
- Dependence-Based Complexity Metrics for Distributed ProgramsJingde Cheng. 265-268
- Prime-Way Interleaved MemoryDe-Lei Lee. 268-272
- Automatically Mapping Sequential Objects to Concurrent Objects: The Mutual Exclusion ProblemDavid L. Sims, Debra A. Hensgen. 269-272
- Allocation of Parallel Programs With Time Variant Resource RequirementsJohn D. Evans, Robert R. Kessler. 271-275
- Communication-Free Data Allocation Techniques for Parallelizing Compilers on MulticomputersTzung-Shi Chen, Jang-Ping Sheu. 273-277
- Reducing the Effect of Hot Spots by Using a Multipath NetworkMu-Cheng Wang, Howard Jay Siegel, Mark A. Nichols, Seth Abraham. 274-281
- Impact of Data Placement on Parallel I/O SystemsJ. Bartlett Sinclair, J. Tang, Peter J. Varman, Balakrishna R. Iyer. 276-279
- Improving RAID-5 Performance by Un-striping Moderate-Sized FilesRonald K. McMurdy, Badrinath Roysam. 279-282
- Prefix Computation On a Faulty HypercubeC. S. Raghavendra, M. A. Sridhar, S. Harikumar. 280-283
- Hardware Support for Fast Reconfigurability in Processor ArraysMassimo Maresca, Hungwen Li, Pierpaolo Baglietto. 282-289
- On Performance and Efficiency of VLIW and SuperscalarSoo-Mook Moon, Kemal Ebcioglu. 283-287
- Task Based Reliability for Large Systems: A Hierarchical Modeling ApproachTeresa A. Dahlberg, Dharma P. Agrawal. 284-287
- Efficient Broadcast in All-Port Wormhole-Routed HypercubesPhilip K. McKinley, Christian Trefftz. 288-291
- Performance of a Globally-Clocked Parallel SimulatorGregory D. Peterson, Roger D. Chamberlain. 289-298
- Closed Form Solutions for Bus and Tree Networks of Processors Load Sharing A Divisible JobSameer M. Bataineh, Te-Yu Hsiung, Thomas G. Robertazzi. 290-293
- Implementing Speculative Parallelism in Possible Computational WorldsDebra S. Jusak, James W. Hearne, Hilda Halliday. 292-296
- The Message Flow Model for Routing in Wormhole-Routed NetworksXiaola Lin, Philip K. McKinley, Lionel M. Ni. 294-297
- Fast Enumeration of Solutions for Data Dependence Analysis and Data Locality OptimizationChristine Eisenbeis, Olivier Temam, Harry A. G. Wijshoff. 299-306
- Generalized Fibonacci CubesW.-J. Hsu, M. J. Chung. 299-302
- On Compiling Array Expressions for Efficient Execution on Distributed-Memory MachinesSandeep K. S. Gupta, S. D. Kaushik, S. Mufti, Sanjay Sharma, Chua-Huang Huang, P. Sadayappan. 301-305
- HMIN: A New Method for Hierarchical Interconnection of ProcessorsYashovardhan R. Potlapalli, Dharma P. Agrawal. 303-306
- Tightly Connected Hierarchical Interconnection Networks for Parallel ProcessorsPeter Thomas Breznay, Mario Alberto López. 307-310
- Square Meshes Are Not Optimal For Convex Hull ComputationDharmavani Bhagavathi, Himabindu Gurla, Stephan Olariu, Rong Lin, James L. Schwing, Jingyuan Zhang. 307-310
- The Folded Petersen Network: A New Communication-Efficient Multiprocessor TopologySabine R. Öhring, Sajal K. Das. 311-314
- Embedding Large Mesh of Trees and Related Networks in the Hypercube With Load BalancingKemal Efe. 311-315
- Hierarchical WK-Recursive Topologies for Multicomputer SystemsRonald Fernandes, Arkady Kanevsky. 315-318
- Substructure Allocation in Recursive Interconnection NetworksRonald Fernandes, Arkady Kanevsky. 319-322
- Coherence, Synchronization and State-sharing in Distributed Shared-memory ApplicationsR. Ananthanarayanan, Mustaque Ahamad, Richard J. LeBlanc. 324-331
- A Characterization of Scalable Shared MemoriesPrince Kohli, Gil Neiger, Mustaque Ahamad. 332-335
- P³M: A Virtual Machine Approach to Massively Parallel ComputingFabrizio Baiardi, Mehdi Jazayeri. 340-344
- Pipeline Processing of Multi-Way Join Queries in Shared-Memory SystemsKian-Lee Tan, Hongjun Lu. 345-348
- Panel: In Search of a Universal (But Useful) Model of a Parallel ComputationHoward Jay Siegel. 349-350