Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29 - December 1, 1995

Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29 - December 1, 1995. ACM/IEEE, 1995.

Conference: MICRO 1995

Performance issues in correlated branch prediction schemes. Nicholas C. Gloy, Michael D. Smith, Cliff Young. 3-14

Dynamic path-based branch correlation. Ravi Nair. 15-23

The predictability of branches in libraries. Brad Calder, Dirk Grunwald, Amitabh Srivastava. 24-34

The performance impact of incomplete bypassing in processor pipelines. Pritpal S. Ahuja, Douglas W. Clark, Anne Rogers. 36-45

Efficient instruction scheduling using finite state automata. Vasanth Bala, Norman Rubin. 46-56

Critical path reduction for scalar programs. Michael S. Schlansker, Vinod Kathail. 57-69

A limit study of local memory requirements using value reuse profiles. Andrew S. Huang, John Paul Shen. 71-81

Zero-cycle loads: microarchitecture support for reducing load latency. Todd M. Austin, Gurindar S. Sohi. 82-92

A modified approach to data cache management. Gary S. Tyson, Matthew K. Farrens, John Matthews, Andrew R. Pleszkun. 93-103

Petri net versus modulo scheduling for software pipelining. Vicki H. Allan, U. R. Shah, K. M. Reddy. 105-110

Modulo scheduling with multiple initiation intervals. Nancy J. Warter-Perez, Noubar Partamian. 111-119

Spill-free parallel scheduling of basic blocks. B. Natarajan, Michael S. Schlansker. 119-124

Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation. Jack W. Davidson, Sanjay Jinturkar. 125-132

Self-regulation of workload in the Manchester Data-Flow computer. John R. Gurd, David F. Snelling. 135-145

The M-Machine multicomputer. Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, Whay Sing Lee. 146-156

Region-based compilation: an introduction and motivation. Richard E. Hank, Wen-mei W. Hwu, B. Ramakrishna Rau. 158-168

An experimental study of several cooperative register allocation and instruction scheduling strategies. Cindy Norris, Lori L. Pollock. 169-179

Register allocation for predicated code. Alexandre E. Eichenberger, Edward S. Davidson. 180-191

Partial resolution in branch target buffers. Barry S. Fagin, Kathryn Russell. 193-198

A system level perspective on branch architecture performance. Brad Calder, Dirk Grunwald, Joel S. Emer. 199-206

Dynamic rescheduling: a technique for object code compatibility in VLIW architectures. Thomas M. Conte, Sumedh W. Sathaye. 208-218

Improving CISC instruction decoding performance using a fill unit. Mark Smotherman, Manoj Franklin. 219-229

SPAID: software prefetching in pointer- and call-intensive environments. Mikko H. Lipasti, William J. Schmidt, Steven R. Kunkel, Robert R. Roediger. 231-236

An effective programmable prefetch engine for on-chip caches. Tien-Fu Chen. 237-242

Cache miss heuristics and preloading techniques for general-purpose programs. Toshihiro Ozawa, Yasunori Kimura, Shin-ichiro Nishizaki. 243-248

Alternative implementations of hybrid branch predictors. Po-Ying Chang, Eric Hao, Yale N. Patt. 252-257

Control flow prediction with tree-like subgraphs for superscalar processors. Simonjit Dutta, Manoj Franklin. 258-263

The role of adaptivity in two-level adaptive branch prediction. Stuart Sechrest, Chih-Chieh Lee, Trevor N. Mudge. 264-269

Design of storage hierarchy in multithreaded architectures. Lucas Roh, Walid A. Najjar. 271-278

An investigation of the performance of various instruction-issue buffer topologies. Stéphan Jourdan, Pascal Sainrat, Daniel Litaize. 279-284

Decoupling integer execution in superscalar processors. Subbarao Palacharla, James E. Smith. 285-290

Exploiting short-lived variables in superscalar processors. Luis A. Lozano, Guang R. Gao. 292-302

Partitioned register file for TTAs. Johan Janssen, Henk Corporaal. 303-312

Disjoint eager execution: an optimal form of speculative execution. Augustus K. Uht, Vijay Sindagi, Kelley Hall. 313-325

Unrolling-based optimizations for modulo scheduling. Daniel M. Lavery, Wen-mei W. Hwu. 327-337

Stage scheduling: a technique to reduce the register requirements of a modulo schedule. Alexandre E. Eichenberger, Edward S. Davidson. 338-349

Hypernode reduction modulo scheduling. Josep Llosa, Mateo Valero, Eduard Ayguadé, Antonio González. 350-360
