TACO - researchr journal

22	--	Michael R. Jantz, Prasad A. Kulkarni. 1
23	--	Xiangyu Dong, Norman P. Jouppi, Yuan Xie. A circuit-architecture co-optimization framework for exploring nonvolatile memory hierarchies
24	--	Jishen Zhao, Guangyu Sun, Gabriel H. Loh, Yuan Xie. Optimizing GPU energy efficiency with 3D die-stacking graphics memory and reconfigurable memory interface
25	--	Chien-Chi Chen, Sheng-De Wang. An efficient multicharacter transition string-matching engine based on the aho-corasick algorithm
26	--	Yangchun Luo, Wei-Chung Hsu, Antonia Zhai. The design and implementation of heterogeneous multicore systems for energy-efficient speculative thread execution
27	--	Dyer Rolán, Basilio B. Fraguela, Ramon Doallo. 1
28	--	Samantika Subramaniam, Simon C. Steely Jr., William Hasenplaugh, Aamer Jaleel, Carl J. Beckmann, Tryggve Fossum, Joel S. Emer. Using in-flight chains to build a scalable cache coherence protocol
29	--	Daniel Sánchez, Yiannakis Sazeides, Juan M. Cebrian, José M. García 0001, Juan L. Aragón. Modeling the impact of permanent faults in caches
30	--	Sanghoon Lee 0006, James Tuck. Automatic parallelization of fine-grained metafunctions on a chip multiprocessor
31	--	Christophe Dubach, Timothy M. Jones, Edwin V. Bonilla. Dynamic microarchitectural adaptation using machine learning
32	--	Long Chen, Yanan Cao, Zhao Zhang. E3CC: A memory error protection scheme with novel address mapping for subranked and low-power memories
33	--	Yingying Tian, Samira Manabi Khan, Daniel A. Jiménez. Temporal-based multilevel correlating inclusive cache replacement
34	--	Qixiao Liu, Miquel Moretó, Victor Jiménez, Jaume Abella, Francisco J. Cazorla, Mateo Valero. Hardware support for accurate per-task energy metering in multicore systems
35	--	Sanyam Mehta, Gautham Beeraka, Pen-Chung Yew. Tile size selection revisited
36	--	Bogdan Prisacari, Germán Rodríguez, Cyriel Minkenberg, Torsten Hoefler. Fast pattern-specific routing for fat tree networks
37	--	Maximilien Breughe, Lieven Eeckhout. Selecting representative benchmark inputs for exploring microprocessor design spaces
38	--	Christoph Kerschbaumer, Eric Hennigan, Per Larsen, Stefan Brunthaler, Michael Franz. Information flow tracking meets just-in-time compilation
39	--	Rupesh Nasre. Time- and space-efficient flow-sensitive points-to analysis
40	--	Wenjia Ruan, Yujie Liu, Michael F. Spear. Boosting timestamp-based transactional memory by exploiting hardware cycle counters
41	--	Tanima Dey, Wei Wang, Jack W. Davidson, Mary Lou Soffa. ReSense: Mapping dynamic workloads of colocated multithreaded applications using resource sensitivity
42	--	Adrià Armejach, J. Rubén Titos Gil, Anurag Negi, Osman S. Unsal, Adrián Cristal. Techniques to improve performance in requester-wins hardware transactional memory
43	--	Myeongjae Jeon, Conglong Li, Alan L. Cox, Scott Rixner. Reducing DRAM row activations with eager read/write clustering
44	--	Zhijia Zhao, Michael Bebenita, Dave Herman, Jianhua Sun, Xipeng Shen. HPar: A practical parallel parser for HTML - taming HTML complexities for parallel parsing
45	--	Ehsan Totoni, Mert Dikmen, María Jesús Garzarán. Easy, fast, and energy-efficient object detection on heterogeneous on-chip architectures
46	--	Viacheslav V. Fedorov, Sheng Qiu, A. L. Narasimha Reddy, Paul V. Gratz. ARI: Adaptive LLC-memory traffic management
47	--	Cecilia González-Alvarez, Jennifer B. Sartor, Carlos Alvarez, Daniel Jiménez-González, Lieven Eeckhout. Accelerating an application domain with specialized functional units
48	--	Xiaolin Wang, Lingmei Weng, Zhenlin Wang, Yingwei Luo. Revisiting memory management on virtualized environments
49	--	Chuntao Jiang, Zhibin Yu, Hai Jin, Cheng-Zhong Xu, Lieven Eeckhout, Wim Heirman, Trevor E. Carlson, Xiaofei Liao. PCantorSim: Accelerating parallel architecture simulation through fractal-based sampling
50	--	Srdan Stipic, Vesna Smiljkovic, Osman S. Unsal, Adrián Cristal, Mateo Valero. Profile-guided transaction coalescing - lowering transactional overheads by merging transactions
51	--	Zhe Wang, Shuchang Shan, Ting Cao, Junli Gu, Yi Xu, Shuai Mu, Yuan Xie 0001, Daniel A. Jiménez. WADE: Writeback-aware dynamic cache management for NVM-based main memory system
52	--	Yong Li 0009, Yaojun Zhang, Hai Li, Yiran Chen, Alex K. Jones. C1C: A configurable, compiler-guided STT-RAM L1 cache
53	--	Naznin Fauzia, Venmugil Elango, Mahesh Ravishankar, J. Ramanujam, Fabrice Rastello, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan. Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential
54	--	Alen Bardizbanyan, Magnus Själander, David B. Whalley, Per Larsson-Edefors. Designing a practical data filter cache to improve both energy efficiency and performance
55	--	Andrei Hagiescu, Bing Liu 0013, R. Ramanathan, Sucheendra K. Palaniappan, Zheng Cui, Bipasa Chattopadhyay, P. S. Thiagarajan, Weng-Fai Wong. GPU code generation for ODE-based applications with phased shared-data access patterns
56	--	JungHee Lee, Chrysostomos Nicopoulos, Hyung Gyu Lee, Jongman Kim. TornadoNoC: A lightweight and scalable on-chip network architecture for the many-core era
57	--	Christos Strydis, Robert M. Seepers, Pedro Peris-Lopez, Dimitrios Siskos, Ioannis Sourdis. A system architecture, processor, and communication protocol for secure implants
58	--	Wonsub Kim, Yoonseo Choi, Haewoo Park. Fast modulo scheduler utilizing patternized routes for coarse-grained reconfigurable architectures
59	--	Dorit Nuzman, Revital Eres, Sergei Dyshel, Marcel Zalmanovici, Jose Castanos. JIT technology with C/C++: Feedback-directed dynamic recompilation for statically compiled languages
60	--	Thejas Ramashekar, Uday Bondhugula. Automatic data allocation and buffer management for multi-GPU machines
61	--	Hans Vandierendonck, George Tzenakis, Dimitrios S. Nikolopoulos. Analysis of dependence tracking algorithms for task dataflow execution
62	--	Yeonghun Jeong, Seongseok Seo, Jongeun Lee. Evaluator-executor transformation for efficient pipelining of loops with conditionals
63	--	Rajkishore Barik, Jisheng Zhao, Vivek Sarkar. A decoupled non-SSA global register allocation using bipartite liveness graphs
64	--	Peter Gavin, David B. Whalley, Magnus Själander. Reducing instruction fetch energy in multi-issue processors

9	--	TACO Reviewers 2012
10	--	Eran Shifer, Shlomo Weiss. Low-latency adaptive mode transitions and hierarchical power management in asymmetric clustered cores
11	--	Yosi Ben-Asher, Nadav Rotem. Hybrid type legalization for a sparse SIMD instruction set
12	--	Yuanwu Lei, Yong Dou, Lei Guo, Jinbo Xu, Jie Zhou, Yazhuo Dong, Hongjian Li. VLIW coprocessor for IEEE-754 quadruple-precision elementary functions
13	--	Motohiro Kawahito, Hideaki Komatsu, Takao Moriyama, Hiroshi Inoue, Toshio Nakatani. Idiom recognition framework using topological embedding
14	--	Ghassan Shobaki, Maxim Shawabkeh, Najm Eldeen Abu Rmaileh. Preallocation instruction scheduling with register pressure minimization using a combinatorial optimization approach
15	--	Dongrui She, Yifan He, Henk Corporaal. An energy-efficient method of supporting flexible special instructions in an embedded processor with compact ISA
16	--	V. Krishna Nandivada, Rajkishore Barik. Improved bitwidth-aware variable packing
17	--	Jung Ho Ahn, Young Hoon Son, John Kim. Scalable high-radix router microarchitecture using a network switch organization
18	--	Libo Huang, Zhiying Wang, Nong Xiao, Yongwen Wang, Qiang Dou. Adaptive communication mechanism for accelerating MPI functions in NoC-based multicore processors
19	--	Avinash Malik, David Gregg. Orchestrating stream graphs using model checking
20	--	Zheng Wang, Michael F. P. O'Boyle. Using machine learning to partition streaming programs
21	--	Ali Bakhoda, John Kim, Tor M. Aamodt. Designing on-chip networks for throughput accelerators

6	--	Angeliki Kritikakou, Francky Catthoor, George Athanasiou, Vasilios I. Kelefouras, Costas E. Goutis. Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints
7	--	Lei Jiang, Yu Du, Bo Zhao, Youtao Zhang, Bruce R. Childers, Jun Yang. Hardware-Assisted Cooperative Integration of Wear-Leveling and Salvaging for Phase Change Memory
8	--	Kyuseung Han, Junwhan Ahn, Kiyoung Choi. Power-Efficient Predication Techniques for Acceleration of Control Flow Execution on CGRA
9	--	Chao Wang, Xi Li, Junneng Zhang, Xuehai Zhou, Xiaoning Nie. MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs

1	--	Yunji Chen, Tianshi Chen, Ling Li, Ruiyang Wu, Dao-Fu Liu, Weiwu Hu. Deterministic Replay Using Global Clock
2	--	Daniel Lustig, Abhishek Bhattacharjee, Margaret Martonosi. TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs
3	--	Rong Chen, Haibo Chen. Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling
4	--	Michela Becchi, Patrick Crowley. A-DFA: A Time- and Space-Efficient DFA Compression Algorithm for Fast Regular Expression Evaluation
5	--	Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi. The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing

Journal: TACO

Volume 10, Issue 4

Volume 10, Issue 3

Volume 10, Issue 2

Volume 10, Issue 1