- Reducing latencies of pipelined cache accesses through set prediction. Aneesh Aggarwal. 2-11
- Characterization of L3 cache behavior of SPECjAppServer2002 and TPC-C. Eriko Nurvitadhi, Nirut Chalainanont, Shih-Lien Lu. 12-20
- A hybrid hardware/software approach to efficiently determine cache coherence Bottlenecks. Jaydeep Marathe, Frank Mueller, Bronis R. de Supinski. 21-30
- A NUCA substrate for flexible CMP cache sharing. Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhang, Doug Burger, Stephen W. Keckler. 31-40
- Fast branch misprediction recovery in out-of-order superscalar processors. Peng Zhou, Soner Önder, Steve Carr. 41-50
- Tornado warning: the perils of selective replay in multithreaded processors. Yongxiang Liu, Anahita Shayesteh, Gokhan Memik, Glenn Reinman. 51-60
- An asymmetric clustered processor based on value content. Ruben Gonzalez, Adrián Cristal, Miquel Pericàs, Mateo Valero, Alexander V. Veidenbaum. 61-70
- A heterogeneously segmented cache architecture for a packet forwarding engine. Kaushik Rajan, R. Govindarajan. 71-80
- Low-overhead call path profiling of unmodified, optimized code. Nathan Froyd, John M. Mellor-Crummey, Robert J. Fowler. 81-90
- Design of a next generation sampling service for large scale data analysis applications. Huai Wang, Srinivasan Parthasarathy, Amol Ghoting, Shirish Tatikonda, Gregory Buehrer, Tahsin M. Kurç, Joel H. Saltz. 91-100
- Online performance analysis by statistical sampling of microprocessor performance counters. Reza Azimi, Michael Stumm, Robert W. Wisniewski. 101-110
- Improved automatic testcase synthesis for performance model validation. Robert H. Bell Jr., Lizy Kurian John. 111-120
- Automatic thread distribution for nested parallelism in OpenMP. Alejandro Duran, Marc González, Julita Corbalán. 121-130
- Lightweight reference affinity analysis. Xipeng Shen, Yaoqing Gao, Chen Ding, Roch Archambault. 131-140
- Think globally, search locally. Kamen Yotov, Keshav Pingali, Paul Stodghill. 141-150
- Facilitating the search for compositions of program transformations. Albert Cohen, Marc Sigler, Sylvain Girbal, Olivier Temam, David Parello, Nicolas Vasilache. 151-160
- Generating new general compiler optimization settings. Masayo Haneda, Peter M. W. Knijnenburg, Harry A. G. Wijshoff. 161-168
- An integrated simdization framework using virtual vectors. Peng Wu, Alexandre E. Eichenberger, Amy Wang, Peng Zhao. 169-178
- Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation. Jose Renau, James Tuck, Wei Liu, Luis Ceze, Karin Strauss, Josep Torrellas. 179-188
- Towards automatic translation of OpenMP to MPI. Ayon Basumallik, Rudolf Eigenmann. 189-198
- TAPE: a transactional application profiling environment. Hassan Chafi, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, JaeWoong Chung, Lance Hammond, Christos Kozyrakis, Kunle Olukotun. 199-208
- Low-power, low-complexity instruction issue using compiler assistance. Madhavi Gopal Valluri, Lizy Kurian John, Kathryn S. McKinley. 209-218
- Thread-Level Speculation on a CMP can be energy efficient. Jose Renau, Karin Strauss, Luis Ceze, Wei Liu, Smruti R. Sarangi, James Tuck, Josep Torrellas. 219-228
- Power-aware resource allocation in high-end systems via online simulation. Barry Lawson, Evgenia Smirni. 229-238
- The architecture of the HP Superdome shared-memory multiprocessor. Gary Gostin, Jean-Francois Collard, Kirby Collins. 239-245
- Scaling physics and material science applications on a massively parallel Blue Gene/L system. George Almási, Gyan Bhanot, Alan Gara, Manish Gupta, James C. Sexton, Robert Walkup, Vasily Bulatov, Andrew W. Cook, Bronis R. de Supinski, James N. Glosli, Jeffrey A. Greenough, François Gygi, Alison Kubota, Steve Louis, Thomas E. Spelce, Frederick H. Streitz, Peter L. Williams, Robert K. Yates, Charles Archer, José E. Moreira, Charles A. Rendleman. 246-252
- Optimization of MPI collective communication on BlueGene/L systems. George Almási, Philip Heidelberger, Charles Archer, Xavier Martorell, C. Christopher Erway, José E. Moreira, Burkhard D. Steinmacher-Burow, Yili Zheng. 253-262
- Transparent caching with strong consistency in dynamic content web sites. Cristiana Amza, Gokul Soundararajan, Emmanuel Cecchet. 264-273
- Disk layout optimization for reducing energy consumption. Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir. 274-283
- Continuous Replica Placement schemes in distributed systems. Thanasis Loukopoulos, Petros Lampsas, Ishfaq Ahmad. 284-292
- A performance-conserving approach for reducing peak power consumption in server systems. Wesley M. Felter, Karthick Rajamani, Tom W. Keller, Cosmin Rusu. 293-302
- System noise, OS clock ticks, and fine-grained parallel applications. Dan Tsafrir, Yoav Etsion, Dror G. Feitelson, Scott Kirkpatrick. 303-312
- Another approach to backfilled jobs: applying virtual malleability to expired windows. Gladys Utrera, Julita Corbalán, Jesús Labarta. 313-322
- High performance support of parallel virtual file system (PVFS2) over Quadrics. Weikuan Yu, Shuang Liang, Dhabaleswar K. Panda. 323-331
- The implications of working set analysis on supercomputing memory hierarchy design. Richard C. Murphy, Arun Rodrigues, Peter M. Kogge, Keith D. Underwood. 332-340
- Improving the computational intensity of unstructured mesh applications. Brian S. White, Sally A. McKee, Bronis R. de Supinski, Brian Miller, Daniel J. Quinlan, Martin Schulz. 341-350
- Parallel sparse LU factorization on second-class message passing platforms. Kai Shen. 351-360
- Cache oblivious stencil computations. Matteo Frigo, Volker Strumpen. 361-366
- Multigrain parallel Delaunay Mesh generation: challenges and opportunities for multithreaded architectures. Christos D. Antonopoulos, Xiaoning Ding, Andrey N. Chernikov, Filip Blagojevic, Dimitrios S. Nikolopoulos, Nikos Chrisochoides. 367-376
- What is worth learning from parallel workloads?: a user and session based analysis. Julia Zilber, Ofer Amit, David Talby. 377-386
- affinity-on-next-touch: increasing the performance of an industrial PDE solver on a cc-NUMA system. Henrik Löf, Sverker Holmgren. 387-392
- Automatic generation and tuning of MPI collective communication routines. Ahmad Faraj, Xin Yuan. 393-402