- Array Operation Synthesis to Optimize HPF Programs. Gwan-Hwan Hwang, Jenq Kuen Lee, Roy Dz-Ching Ju. 1-8
- Mapping the Preconditioned Conjugate Gradient Algorithm for Neutron Diffusion Applications onto Parallel Machines. John John E. So, Raghunandan Janardhan, Thomas J. Downar, Howard Jay Siegel. 1-10
- Introduction to the 1996 ICPP Workshop on Challenges for Parallel Processing. Howard Jay Siegel. 1-6
- Measuring the Performance of Parallel Computations. John R. Rice. 8-9
- A Task-Based Dependability Model for k-ary n-Cubes. Aniruddha S. Vaidya, Byung S. Yoo, Chita R. Das. 9-16
- Polynomial-Time Nested Loop Fusion with Full Parallelism. Edwin Hsing-Mean Sha, Chenhua Lang, Nelson L. Passos. 9-16
- Mechanisms for Mapping High-Level Parallel Performance Data. R. Bruce Irvin, Barton P. Miller. 10-19
- A Three-Parameter Fast Givens QR Algorithm for Superscalar Processors. James J. Carrig Jr., Gerard G. L. Meyer. 11-18
- Compiler Support for Privatization on Distributed-Memory Machines. Daniel J. Palermo, Ernesto Su, Eugene W. Hodges IV, Prithviraj Banerjee. 17-24
- Decomposition of Total Exchange for Multidimensional Interconnects. Vassilios V. Dimakopoulos, Nikitas J. Dimopoulos. 17-21
- Wavelet Decomposition on High-Performance Computing Systems. Tarek A. El-Ghazawi, Jacqueline Le Moigne. 19-23
- The Next Frontier: Interactive and Closed Loop Performance Steering. Daniel A. Reed, Christopher L. Elford, Tara M. Madhyastha, Evgenia Smirni, Stephen E. Lamm. 20-31
- Optimal Communication Algorithms for Heterogeneous Computing over ATM Networks. Xiaodong Wang, Vwani P. Roychowdhury. 22-25
- On the Scalability of 2-D Wavelet Transform Algorithms on Fine-grained Parallel Machines. Jamshed N. Patel, Ashfaq A. Khokhar, Leah H. Jamieson. 24-28
- On Optimal Size and Shape of Supernode Transformations. Edin Hodzic, Weijia Shang. 25-34
- Conflict Resolution in the Inside-Out Routing Algorithm. Seung-Woo Seo, Tse-Yun Feng, Yanggon Kim. 26-33
- A Study of a Non-Linear Optimization Problem Using a Distributed Genetic Algorithm. Nuno Neves, Anthony-Trung Nguyen, Edgar L. Torres. 29-36
- Issues Related to the Structure and Performance of Parallel Applications in a Production Environment. David Schneider. 32-43
- Efficient Collective Operations with ATM Network Interface Support. Yih Huang, Philip K. McKinley. 34-43
- A Compile Time Partitioning Method for DOALL Loops on Distributed Memory Systems. Santosh Pande. 35-44
- A Parallel Algorithm for State Assignment of Finite State Machines. Gagan Hasteer, Prithviraj Banerjee. 37-45
- Contention-Free Communication Scheduling on 2D Meshes. Andreas Eberhard, Jingke Li. 44-51
- Unique Sets Oriented Partitioning of Nested Loops with Non-uniform Dependences. Jialin Ju, Vipin Chaudhary. 45-52
- A Massively Parallel SIMD Algorithm for Combinatorial Optimization. Ranjit A. Henry, Nicholas S. Flann, Daniel W. Watson. 46-49
- Implementation of a Training Set Parallel Algorithm for an Automated Fingerprint Image Comparison System. Hany H. Ammar, Zhouhui Miao. 50-53
- Commercially Viable MPP Networks. Craig B. Stunkel. 52-63
- Adaptive Routing in Irregular Networks Using Cut-Through Switches. Wenjian Qiao, Lionel M. Ni. 52-60
- Towards Automatic Performance Analysis. Amitabh Sinha, Laxmikant V. Kalé. 53-60
- An Efficient Algorithm for Row Minima Computations in Monotone Matrices. Koji Nakano, Stephan Olariu. 54-61
- Estimating Parallel Execution Time of Loops with Loop-Carried Dependencies. Tsuneo Nakanishi, Kazuki Joe, Constantine D. Polychronopoulos, Keijiro Araki, Akira Fukuda. 61-69
- A High Performance Router Architecture for Interconnection Networks. José Duato, Pedro López, Federico Silla, Sudhakar Yalamanchili. 61-68
- Let Us Build System-Friendly Networks - Build Them Hierarchically. William Tsun-Yuk Hsu, Pen-Chung Yew. 64-73
- Scalable S-to-P Broadcasting on Message-Passing MPPs. Susanne E. Hambrusch, Ashfaq A. Khokhar, Yi Liu. 69-76
- A Novel Parallel Algorithm for Enumerating Combinations. Bing Bing Zhou, Richard P. Brent, X. Qu, W. F. Liang. 70-73
- Performance Analysis and Prediction of Processor Scheduling Strategies in Multiprogrammed Shared-Memory Multiprocessors. Kelvin K. Yue, David J. Lilja. 70-78
- A Time- and Cost-Optimal Algorithm for Overlap Graphs, with Applications. Stephan Olariu, Albert Y. Zomaya. 74-81
- Issues in Designing Truly Scalable Interconnection Networks. Lionel M. Ni. 74-83
- Maximum Reconfiguration of 2-D Mesh Systems with Faults. Nian-Feng Tzeng, Guanghua Lin. 77-84
- The Impact of Speeding up Critical Sections with Data Prefetching and Forwarding. Pedro Trancoso, Josep Torrellas. 79-86
- Randomized Parallel Algorithms for the Homing Sequence Problem. Bala Ravikumar, X. Xiong. 82-89
- A Multicast Protocol Based on a Single Logical Ring Using a Virtual Token and Logical Clocks. Weijia Jia, Jiannong Cao, To-Yat Cheung. 85-92
- Parallel I/O: A Set of Intertwined Systems and Applications Issues. Paul Messina. 85-90
- Synchronization Elimination in the Deposit Model. Susan Hinrichs. 87-94
- Integer Sorting and Routing in Arrays with Reconfigurable Optical Buses. Sandy Pavel, Selim G. Akl. 90-94
- Models for Parallel Computation. Susanne E. Hambrusch. 92-95
- Fault-Tolerant Multicast in Hypercube Multicomputers. Ge-Ming Chiu, Kai-Shung Chen. 93-96
- Prefetching and Caching for Query Scheduling in a Special Class of Distributed Applications. Aman Sinha, Craig M. Chase. 95-102
- Algorithms for Sorting Arbitrary Input Using a Fixed-Size Parallel Sorting Device. Si-Qing Zheng. 95-99
- Linguistic Constructs for BSP Style Programming. Thomas Cheatham. 96-102
- Partition and Task Migration on k-Extra-Stage Omega Networks. Xiaojun Shen, Yixin Zhang. 97-100
- A Spatial-Temporal Parallel Approach for Real-Time MPEG Video Compression. Ke Shen, Edward J. Delp. 100-107
- Benchmarking Message Passing Performance using MPI. Lok T. Liu, David E. Culler, Chad Yoshikawa. 101-110
- Program Analysis for Cache Coherence: Beyond Procedural Boundaries. Lynn Choi, Pen-Chung Yew. 103-113
- What Good are Shared-Memory Models? Phillip B. Gibbons. 103-114
- Design and Implementation of NX Message Passing Using Shrimp Virtual Memory Mapped Communication. Richard Alpert, Cezary Dubnicki, Edward W. Felten, Kai Li. 111-119
- A Timestamp-based Selective Invalidation Scheme for Multiprocessor Cache Coherence. Xin Yuan, Rami G. Melhem, Rajiv Gupta. 114-121
- On Combining Technology and Theory in Search of a Parallel Computation Model. Joseph JáJá. 115-123
- A Parallel Algorithm for Scientific Visualization. Günter Knittel. 116-123
- A Priority-Based Flow Control Mechanism to Support Real-Time Traffic in Pipelined Direct Networks. Shobana Balakrishnan, Füsun Özgüner. 120-127
- Scheduling of Wavefront Parallelism on Scalable Shared-memory Multiprocessors. Naraig Manjikian, Tarek S. Abdelrahman. 122-131
- Parallel Processors for Synthetic Aperture Radar Imaging. Peter G. Meisl, Mabo Robert Ito, Ian G. Cumming. 124-131
- Portable Parallel Programming Languages. Rudolf Eigenmann. 125-131
- Efficient and Flexible Object Sharing. Miguel Castro, Manuel Sequeira, Manuel Costa, Paulo Guedes. 128-137
- Automatic Self-Allocating Threads (ASAT) on an SGI Challenge. Charles Severance, Richard J. Enbody. 132-139
- Efficient Algorithms for Estimating Atmospheric Parameters for Surface Reflectance Retrieval. Hassan Fallah-Adl, Joseph JáJá, Shunlin Liang. 132-141
- Portable Parallel Programming in HPC++. Peter H. Beckman, Dennis Gannon, Elizabeth Johnson. 132-139
- Reducing Cache Invalidation Overheads in Wormhole Routed DSMs Using Multidestination Message Passing. Donglai Dai, Dhabaleswar K. Panda. 138-145
- A Hydro-Dynamic Approach to Heterogeneous Dynamic Load Balancing in a Network of Computers. Chi-Chung Hui, Samuel T. Chanson. 140-147
- Fortran: A Modern Standard Programming Language For Parallel Scalable High Performance Technical Computing. David B. Loveman. 140-148
- Synthesizing Efficient Out-of-Core Programs for Block Recursive Algorithms Using Block-Cyclic Data Distributions. Zhiyong Li, John H. Reif, Sandeep K. Gupta. 142-149
- Software-Based Communication Latency Hiding for Commodity Workstation Networks. Volker Strumpen. 146-153
- A Load-Balancing Algorithm for N-Cubes. Min-You Wu, Wei Shu. 148-155
- Restructuring Programs for High-Speed Computers with Polaris. William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jaejin Lee, Thomas Lawrence, Jay Hoeflinger, David A. Padua, Yunheung Paek, Paul Petersen, William M. Pottenger, Lawrence Rauchwerger, Peng Tu, Stephen Weatherford. 149-161
- FAST: A Low-Complexity Algorithm for Efficient Scheduling of DAGs on Parallel Processors. Yu-Kwong Kwok, Ishfaq Ahmad, Jun Gu. 150-157
- Reducing Conflicts in Direct-Mapped Caches with a Temporality-Based Design. Jude A. Rivers, Edward S. Davidson. 154-163
- Efficient Reliable Multicast on Myrinet. Kees Verstoep, Koen Langendoen, Henri E. Bal. 156-165
- 3-D Land Avoidance and Load Balancing in Regional Ocean Simulation. Luiz De Rose, Kyle Gallivan, Efstratios Gallopoulos. 158-165
- A Hybrid Cache Coherence Protocol for a Decoupled Multi-Channel Optical Network: SPEED DMON. Joon-Ho Ha, Timothy Mark Pinkston. 164-171
- A Flexible Processor Allocation Strategy for Mesh Connected Parallel Systems. Vipul Gupta, Arun Jayendran. 166-173
- Analysis of Heart Rate Variability on a Massively Parallel Processor. Suchendra M. Bhandarkar, Sridhar Chirravuri, David Whitmire. 166-169
- Parallel Implementation of Cone Beam Tomography. David A. Reimann, Vipin Chaudhary, Michael J. Flynn, Ishwar K. Sethi. 170-173
- An Efficient Hybrid Cache Coherence Protocol for Shared Memory Multiprocessors. Yeimkuan Chang, Laxmi N. Bhuyan. 172-179
- Construction of Optimal Multicast Trees Based on the Parameterized Communication Model. Ju-Young Lee Park, Hyeong-Ah Choi, Natawut Nupairoj, Lionel M. Ni. 180-187
- Load Balancing for Parallel Loops in Workstation Clusters. Tae Hyung Kim, James M. Purtilo. 182-190
- Minimizing Node Contention in Multiple Multicast on Wormhole k-ary N-Cube Networks. Ram Kesavan, Dhabaleswar K. Panda. 188-195
- Performance Analysis of Task Migration in a Portable Parallel Environment. Balkrishna Ramkumar, Gopal Chillariga. 191-198
- An Efficient Distributed Mutual Exclusion Algorithm. Niki Pissinou, Kia Makki, E. K. Park, Z. Hu, W. Wong. 196-203
- Dynamic Task Scheduling and Allocation for 3D Torus Multicomputer Systems. Hee Yong Youn, Hyunseung Choo, Seong-Moo Yoo, Behrooz Shirazi. 199-206
- Improving the I/O Performance of Real-Time Database Systems with Multiple-Disk Storage Structures. Albert Mo Kim Cheng, Sharon X. Gu. 204-211
- A Novel Algorithm for Buddy-Subcube Compaction in Hypercubes. Hsing-Lung Chen, Shu-Hua Hu. 207-214
- Implementation and Performance Evaluation of the Parallel Relational Database Server SDC-II. Takayuki Tamura, Minoru Nakamura, Masaru Kitsuregawa, Yoshihisa Ogawa. 212-221
- MpPVM: A Software System for Non-Dedicated Heterogeneous Computing. Kasidit Chanchio, Xian-He Sun. 215-222
- Simulating Message-Driven Programs. Attila Gürsoy, Laxmikant V. Kalé. 223-230
- Exploiting Instruction Level Parallelism with the DS Architecture. Yinong Zhang, George B. Adams III. 230-237
- A Fine-Grain Parallel Architecture Based on Barrier Synchronization. Henry G. Dietz, Raymond Hoare, Timothy Mattox. 247-250
- Minimizing Communication of a Recirculating Bitonic Sorting Network. Jae-dong Lee, Kenneth E. Batcher. 251-254
- Parallel and Distributed Meldable Priority Queues Based on Binomial Heaps. Vincenzo A. Crupi, Sajal K. Das, Maria Cristina Pinotti. 255-262
- A Scalable Cache Design for I-Structures in Multithreaded Architectures. Jean-Luc Gaudiot, Chung-Ta Cheng. 263-266
- An Optimal Routing Policy for Mesh-Connected Topologies. Jie Wu. 267-270
- Designing Processor-Cluster Based Systems: Interplay Between Organizations and Broadcasting Algorithms. Debashis Basak, Dhabaleswar K. Panda. 271-274
- A Framework for Building Distributed Dynamic Applications. Niki Pissinou, B. K. Rajashekhar, Kia Makki, Kanonkluk Vanapipat. 275-278