- Global Context-Based Value Prediction. Tarun Nakra, Rajiv Gupta, Mary Lou Soffa. 4-12
- Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance. David Brooks, Margaret Martonosi. 13-22
- Improving the Accuracy vs. Speed Tradeoff for Simulating Shared-Memory Multiprocessors with ILP Processors. Murthy Durbhakula, Vijay S. Pai, Sarita V. Adve. 23-32
- Memory Hierarchy Considerations for Fast Transpose and Bit-Reversals. Kang Su Gatlin, Larry Carter. 33
- Instruction Recycling on a Multiple-Path Processor. Steven Wallace, Dean M. Tullsen, Brad Calder. 44-53
- Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor. Dean M. Tullsen, Jack L. Lo, Susan J. Eggers, Henry M. Levy. 54-58
- The Synergy of Multithreading and Access/Execute Decoupling. Joan-Manuel Parcerisa, Antonio González. 59-63
- Out-of-Order Execution may not be Cost-Effective on Processors Featuring Simultaneous Multithreading. Sébastien Hily, André Seznec. 64
- Impulse: Building a Smarter Memory Controller. John B. Carter, Wilson C. Hsieh, Leigh Stoller, Mark R. Swanson, Lixin Zhang, Erik Brunvand, Al Davis, Chen-Chi Kuo, Ravindra Kuramkote, Michael Parker, Lambert Schaelicke, Terry Tateyama. 70-79
- Access Order and Effective Bandwidth for Streams on a Direct Rambus Memory. Sung I. Hong, Sally A. McKee, Maximo H. Salinas, Robert H. Klenke, James H. Aylor, William A. Wulf. 80-89
- Lightweight Hardware Distributed Shared Memory Supported by Generalized Combining. Kiyofumi Tanaka, Takashi Matsumoto, Kei Hiraki. 90
- Exploiting Basic Block Value Locality with Block Reuse. Jian Huang, David J. Lilja. 106-114
- A Study of Control Independence in Superscalar Processors. Eric Rotenberg, Quinn Jacobson, James E. Smith. 115-124
- Instruction Pre-Processing in Trace Processors. Quinn Jacobson, James E. Smith. 125-129
- Distributed Modulo Scheduling. Marcio Merino Fernandes, Josep Llosa, Nigel P. Topham. 130-134
- Switch Cache: A Framework for Improving the Remote Memory Access Latency of CC-NUMA Multiprocessors. Ravi R. Iyer, Laxmi N. Bhuyan. 152-160
- Improving CC-NUMA Performance Using Instruction-Based Prediction. Stefanos Kaxiras, James R. Goodman. 161
- WildFire: A Scalable Path for SMPs. Erik Hagersten, Michael Koster. 172-181
- Parallel Dispatch Queue: A Queue-Based Programming Abstraction to Parallelize Fine-Grain Communication Protocols. Babak Falsafi, David A. Wood. 182-192
- Limits to the Performance of Software Shared Memory: A Layered Approach. Angelos Bilas, Dongming Jiang, Yuanyuan Zhou, Jaswinder Pal Singh. 193
- RAPID-Cache - A Reliable and Inexpensive Write Cache for Disk I/O Systems. Yiming Hu, Qing Yang, Tycho Nightingale. 204-213
- Permutation Development Data Layout (PDDL). Thomas J. E. Schwarz, Jesse Steinberg, Walter A. Burkhard. 214-217
- Dynamically Variable Line-Size Cache Exploiting High On-Chip Memory Bandwidth of Merged DRAM/Logic LSIs. Koji Inoue, Koji Kai, Kazuaki Murakami. 218-222
- A Scalable Cache Coherent Scheme Exploiting Wormhole Routing Networks. Yunseok Rhee, Joonwon Lee. 223
- The Impact of Link Arbitration on Switch Performance. Marius Pirvu, Laxmi N. Bhuyan, Nan Ni. 228-235
- LAPSES: A Recipe for High Performance Adaptive Router Design. Aniruddha S. Vaidya, Anand Sivasubramaniam, Chita R. Das. 236-243
- Sensitivity of Parallel Applications to Large Differences in Bandwidth and Latency in Two-Layer Interconnects. Aske Plaat, Henri E. Bal, Rutger F. H. Hofman. 244
- Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory. Sandhya Dwarkadas, Kourosh Gharachorloo, Leonidas I. Kontothanassis, Daniel J. Scales, Michael L. Scott, Robert Stets. 260-269
- Using Lamport Clocks to Reason about Relaxed Memory Models. Anne Condon, Mark D. Hill, Manoj Plakal, Daniel J. Sorin. 270-278
- A Performance Comparison of Homeless and Home-Based Lazy Release Consistency Protocols in Software Shared Memory. Alan L. Cox, Eyal de Lara, Y. Charlie Hu, Willy Zwaenepoel. 279-283
- MP-LOCKs: Replacing H/W Synchronization Primitives with Message Passing. Chen-Chi Kuo, John B. Carter, Ravindra Kuramkote. 284
- Efficient All-to-All Broadcast in All-Port Mesh and Torus Networks. Yuanyuan Yang, Jianchao Wang. 290-299
- MMR: A High-Performance Multimedia Router - Architecture and Design Trade-Offs. José Duato, Sudhakar Yalamanchili, Blanca Caminero, Damon S. Love, Francisco J. Quiles. 300-309
- Communication Studies of Single-Threaded and Multithreaded Distributed-Memory Multiprocessors. Andrew Sohn, Yunheung Paek, Jui-Yuan Ku, Yuetsu Kodama, Yoshinori Yamaguchi. 310-314
- Impact of Buffer Size on the Efficiency of Deadlock Detection. Juan Miguel Martínez, Pedro López, José Duato. 315
- Third Workshop on Communication, Architecture, and Applications for Network-Based Parallel Computing (CANPC 99). Anand Sivasubramaniam, Mario Lauria. 320
- Fifth Annual Workshop on Computer Education. David R. Kaeli, Bruce Jacobs. 320
- Multithreaded Execution Architecture and Compilation. Dean M. Tullsen, Guang R. Gao. 321
- Parallel Computing for Irregular Applications. Jacques Chassin de Kergommeaux, Yves Denneulin, Thierry Gautier. 321