- A scalable approach to thread-level speculation. J. Gregory Steffan, Christopher B. Colohan, Antonia Zhai, Todd C. Mowry. 1-12
- Architectural support for scalable speculative parallelization in shared-memory multiprocessors. Marcelo H. Cintra, José F. Martínez, Josep Torrellas. 13-24
- Transient fault detection via simultaneous multithreading. Steven K. Reinhardt, Shubhendu S. Mukherjee. 25-36
- Trace preconstruction. Quinn Jacobson, James E. Smith. 37-46
- A hardware mechanism for dynamic extraction and relayout of program hot spots. Matthew C. Merten, Andrew R. Trick, Erik M. Nystrom, Ronald D. Barnes, Wen-mei W. Hwu. 59-70
- Wattch: a framework for architectural-level power analysis and optimizations. David Brooks, Vivek Tiwari, Margaret Martonosi. 83-94
- Energy-driven integrated hardware-software optimizations using SimplePower. Narayanan Vijaykrishnan, Mahmut T. Kandemir, Mary Jane Irwin, Hyun Suk Kim, Wu Ye. 95-106
- A fully associative software-managed cache design. Erik G. Hallnor, Steven K. Reinhardt. 107-116
- Recency-based TLB preloading. Ashley Saulsbury, Fredrik Dahlgren, Per Stenström. 117-127
- Memory access scheduling. Scott Rixner, William J. Dally, Ujval J. Kapasi, Peter R. Mattson, John D. Owens. 128-138
- Selective, accurate, and timely self-invalidation using last-touch prediction. An-Chow Lai, Babak Falsafi. 139-148
- An embedded DRAM architecture for large-scale spatial-lattice computations. Norman Margolus. 149-160
- Smart Memories: a modular reconfigurable architecture. Ken Mai, Tim Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, Mark Horowitz. 161-171
- Understanding the backward slices of performance degrading instructions. Craig B. Zilles, Gurindar S. Sohi. 172-181
- On the value locality of store instructions. Kevin M. Lepak, Mikko H. Lipasti. 182-191
- Performance analysis of the Alpha 21264-based Compaq ES40 system. Zarka Cvetanovic, Richard E. Kessler. 192-202
- Lx: a technology platform for customizable VLIW embedded processing. Paolo Faraboschi, Geoffrey Brown, Joseph A. Fisher, Giuseppe Desoli, Fred Homewood. 203-213
- Reconfigurable caches and their application to media processing. Parthasarathy Ranganathan, Sarita V. Adve, Norman P. Jouppi. 214-224
- CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit. Zhi Alex Ye, Andreas Moshovos, Scott Hauck, Prithviraj Banerjee. 225-235
- Circuits for wide-window superscalar processors. Dana S. Henry, Bradley C. Kuszmaul, Gabriel H. Loh, Rahul Sami. 236-247
- Clock rate versus IPC: the end of the road for conventional microarchitectures. Vikas Agarwal, M. S. Hrishikesh, Stephen W. Keckler, Doug Burger. 248-259
- Vector instruction set support for conditional operations. James E. Smith, Greg Faanes, Rabin A. Sugumar. 260-269
- Instruction path coprocessors. Yuan C. Chou, John Paul Shen. 270-281
- Piranha: a scalable architecture based on single-chip multiprocessing. Luiz André Barroso, Kourosh Gharachorloo, Robert McNamara, Andreas Nowatzyk, Shaz Qadeer, Barton Sano, Scott Smith, Robert Stets, Ben Verghese. 282-293
- Early load address resolution via register tracking. Michael Bekerman, Adi Yoaz, Freddy Gabbay, Stéphan Jourdan, Maxim Kalaev, Ronny Ronen. 306-315