- LAPACK: a portable linear algebra library for high-performance computers. Ed Anderson, Zhaojun Bai, Jack Dongarra, A. Greenbaum, A. McKenney, Jeremy Du Croz, Sven Hammarling, James Demmel, Christian H. Bischof, Danny C. Sorensen. 2-11
- Multilinear algebra and parallel programming. Rodney W. Johnson, Chua-Huang Huang, John R. Johnson. 20-31
- The impact of memory organization on the performance of matrix multiplication. Jürgen-Friedrich Hake, Willi Homberg. 34-40
- On randomly interleaved memories. Ram Raghavan, John P. Hayes. 49-58
- Tracing application program execution on the Cray X-MP and Cray 2. Allen D. Malony, John L. Larson, Daniel A. Reed. 60-73
- Parallel program debugging with on-the-fly anomaly detection. Robert Hood, Ken Kennedy, John M. Mellor-Crummey. 74-81
- Improving instruction cache behavior by reducing cache pollution. Rajiv Gupta, Chi-Hung Chi. 82-91
- A parallel Monte Carlo search algorithm for the conformational analysis of proteins. Daniel R. Ripoll, Stephen J. Thomas. 94-102
- A parallel computational approach using a cluster of IBM ES/3090 600Js for physical mapping of chromosomes. Steven W. White, David C. Torney, Clive C. Whittaker. 112-121
- Experience with a performance analyzer for multithreaded applications. Gilbert J. Hansen, Charles A. Linthicum, Gary Brooks. 124-131
- Performance evaluation of the IBM RISC System/6000: comparison of an optimized scalar processor with two vector processors. Margaret L. Simmons, Harvey J. Wasserman. 132-141
- The characterization of two scientific workloads using the CRAY X-MP performance monitor. Elizabeth Williams, C. Thomas Myers, Rebecca Koskela. 142-152
- Cost-performance analysis of heterogeneity in supercomputer architectures. Daniel A. Menascé, Virgilio Almeida. 169-177
- Fast barrier synchronization hardware. Carl J. Beckmann, Constantine D. Polychronopoulos. 180-189
- Switch-stacks: a scheme for microtasking nested parallel loops. Jyh-Herng Chow, Williams Ludwell Harrison III. 190-199
- Parallelization of loops with exits on pipelined architectures. Parthasarathy P. Tirumalai, M. Lee, Michael S. Schlansker. 200-212
- High performance preconditioning on supercomputers for the 3D device simulator MINIMOS. Karl P. Traar, Wolfgang R. Mader, Otto Heinreichsberger, Siegfried Selberherr, Martin Stiftinger. 224-231
- Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations. Edward Rothberg, Anoop Gupta. 232-241
- Uni-directional hypercubes. Chih-Hsiang Chou, David Hung-Chang Du. 254-263
- Design and analysis of buffered crossbars and banyans with cut-through switching. Ted H. Szymanski, Chien Fang. 264-273
- A parallel object-oriented total architecture: A-NET. Takanobu Baba, Tsutomu Yoshinaga, Tohru Iijima, Yoshifumi Iwamoto, Masahiro Hamada, Mitsuru Suzuki. 276-285
- A parallel computer model supporting procedure-based communication. Jeffrey X. Zhou. 286-294
- A high-performance, memory-based interconnection system for multicomputer environments. Creve Maples. 295-304
- Another view on parallel speedup. Xian-He Sun, Lionel M. Ni. 324-333
- Group graphs and computational symmetry on massively parallel architecture. Lewis Stiller. 344-353
- Chess and supercomputers: details about optimizing Cray Blitz. Robert M. Hyatt, Harry L. Nelson. 354-363
- Experiences in building the Clemson computational sciences program. D. E. Stevenson, R. M. Panoff. 366-375
- A real introduction to supercomputing: a user training course. Floyd B. Hanson. 376-385
- Loop displacement: an approach for transforming and scheduling loops for parallel execution. Rajiv Gupta. 388-397
- A compiler-assisted approach to SPMD execution. Ron Cytron, Jim Lipkis, Edith Schonberg. 398-406
- Loop distribution with arbitrary control flow. Ken Kennedy, Kathryn S. McKinley. 407-416
- Large-scale computing on clustered vector multiprocessors. Aladin Kamel, Piero Sguazzero, Vittorio Zecca. 418-427
- A vectorized 3-D finite element model for transient simulation of two-phase heat transport with phase transformation and a moving interface. Mark Christon. 436-445
- Parallelization of a radiation transport simulation code on the BBN TC2000 parallel computer. William Celmaster, Edward N. May. 448-454
- Perfect Benchmarks decomposition and performance on VAX multiprocessors. Zarka Cvetanovic, Edward G. Freedman, Charles Nofsinger. 455-474
- Embedding meshes on the star graph. Sanjay Ranka, Jhy-Chun Wang, Nangkang Yeh. 476-485
- Logarithmic time cost optimal parallel sorting is not yet fast in practice! Lasse Natvig. 486-494
- A simple and correct shared-queue algorithm using compare-and-swap. Janice M. Stone. 495-504
- Fine-grain parallelism in the ALPS programming language. Prasad Vishnubhotla. 506-514
- Delirium: an embedding coordination language. Steven Lucco, Oliver Sharp. 515-524
- UC: a language for the connection machine. Rajive Bagrodia, K. Mani Chandy, E. Kwan. 525-534
- Parallel algorithm research at CERFACS. Iain S. Duff. 536-542
- Cache coherence in systems with parallel communication channels and many processors. John C. Willis, Arthur C. Sanderson, Charles R. Hill. 554-563
- Data cache performance of supercomputer applications. David Callahan, Allan Porterfield. 564-572
- Resource binding - a universal approach to parallel programming. Honda Shing, Lionel M. Ni. 574-583
- A flexible communication abstraction for nonshared memory parallel computing. Gail A. Alverson, William G. Griswold, David Notkin, Lawrence Snyder. 584-593
- Implementation machine paradigm for parallel programming. Manohar Rao, Zary Segall, Dalibor F. Vrsalovic. 594-603
- Efficient parallel logic simulation techniques for the connection machine. Moon-Jung Chung, Yunmo Chung. 606-614
- Design of a scalable parallel switch-level simulator for VLSI. Robert B. Mueller-Thuns, Daniel G. Saab, Jacob A. Abraham. 615-624
- SISAL versus FORTRAN: a comparison using the Livermore loops. David C. Cann, John Feo. 626-636
- Experimental analysis of communication/data-conditional aspects of a mixed-mode parallel architecture via synthetic computations. Samuel A. Fineberg, Thomas L. Casavant, Howard Jay Siegel. 637-646
- Performance evaluation of mesh-connected wormhole-routed networks for interprocessor communication in multicomputers. Suresh Chittor, Richard J. Enbody. 647-657
- Theorem proving in propositional logic on vector computers using a generalized Davis-Putnam procedure. Wen-Tsuen Chen, Ming-Yi Fang. 658-665
- Scan primitives for vector computers. Siddhartha Chatterjee, Guy E. Blelloch, Marco Zagha. 666-675
- A vectorized long-period shift-register random number generator. Salvatore Filippone, Paolo Santangelo, Marcello Vitaletti. 676-684
- Architectural support for register allocation in the presence of aliasing. Ben Heggy, Mary Lou Soffa. 730-739
- Information optimization for Monte Carlo data and application to high-temperature quantum chromodynamics. S. Huang, K. J. M. Moriarty, E. Ann Myers, J. Potvin. 742-747
- An optimal hypercube direct N-body solver on the connection machine. Jean-Philippe Brunet, Alan Edelman, Jill P. Mesirov. 748-752
- Monte Carlo simulation of the Ising model and random number generation on the vector processor. Nobuyasu Ito, Yasumasa Kanada. 753-763
- P3D: a Lisp-based format for representing general 3D models. Joel Welling, Chris Nuuja, Phil Andrews. 766-774
- Scientific data visualization: a formal introduction to the rendering and geometric modeling aspects. Vincent J. Harrand, Amar Choudry, John P. Ziebarth. 775-783
- Run-time monitoring of concurrent programs on the Cedar multiprocessor. Sanjay Sharma, Allen D. Malony, Michael W. Berry, Priyamvada Sinvhal-Sharma. 784-793
- Future general purpose supercomputer architectures. James E. Smith, Wei-Chung Hsu, Christopher C. Hsiung. 796-804
- Time dilation visualization in relativity. Ping-Kang Hsiung, Robert H. Thibadeau, Christopher B. Cox, Robert H. P. Dunn. 835-844
- Partitioning declarative programs into communicating processes. John M. A. Roy, Mark Nagel, Lubomir Bic. 846-855
- Generating explicit communication from shared-memory program references. Jingke Li, Marina C. Chen. 865-876
- A network-topology independent task allocation strategy for parallel computers. Takanobu Baba, Yoshifumi Iwamoto, Tsutomu Yoshinaga. 878-887
- Heuristic methods for dynamic load balancing in a message-passing supercomputer. Jian Xu, Kai Hwang. 888-897
- A semi distributed task allocation strategy for large hypercube supercomputers. Ishfaq Ahmad, Arif Ghafoor. 898-907
- Architecture and implementation of a VLIW supercomputer. Robert P. Colwell, W. Eric Hall, Chandra S. Joshi, David B. Papworth, Paul K. Rodman, James E. Tornes. 910-919
- The design of a RISC based multiprocessor chip. Rajiv Gupta, Michael Epstein, Michael Whelan. 920-929
- Soviet high-speed computers: the new generation. Peter Wolcott, Seymour E. Goodman. 930-939
- Performing data flow analysis in parallel. Yong-Fong Lee, Thomas J. Marlowe, Barbara G. Ryder. 942-951
- Experience with interprocedural analysis of array side effects. Paul Havlak, Ken Kennedy. 952-961
- Subdomain dependence test for massive parallelism. Lee-Chung Lu, Marina C. Chen. 962-972