TACO - researchr journal


24	--	Bart Coppens, Bjorn De Sutter, Jonas Maebe. Feedback-driven binary code diversification
25	--	Jeremy Fowers, Greg Brown, John Robert Wernsing, Greg Stitt. A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors
26	--	Erven Rohou, Kevin Williams, David Yuste. Vectorization technology to improve interpreter performance
27	--	Jimmy Cleary, Owen Callanan, Mark Purcell, David Gregg. Fast asymmetric thread synchronization
28	--	Yong Li 0009, Rami G. Melhem, Alex K. Jones. PS-TLB: Leveraging page classification information for fast, scalable and efficient translation for future CMPs
29	--	Kristof Du Bois, Stijn Eyerman, Lieven Eeckhout. Per-thread cycle accounting in multicore processors
30	--	Christian Wimmer, Michael Haupt, Michael L. Van de Vanter, Mick J. Jordan, Laurent Daynès, Doug Simon. Maxine: An approachable virtual machine for, and in, Java
31	--	Malik Murtaza Khan, Protonu Basu, Gabe Rudy, Mary W. Hall, Chun Chen, Jacqueline Chame. A script-based autotuning compiler system to generate high-performance CUDA code
32	--	Kenzo Van Craeynest, Lieven Eeckhout. Understanding fundamental design choices in single-ISA heterogeneous multicore architectures
33	--	Samuel Antao, Leonel Sousa. The CRNS framework and its application to programmable and reconfigurable cryptography
34	--	Boubacar Diouf, Can Hantas, Albert Cohen, Özcan Özturk, Jens Palsberg. A decoupled local memory allocator
35	--	Huimin Cui, Qing Yi, Jingling Xue, Xiaobing Feng. Layout-oblivious compiler optimization for matrix computations
36	--	Stephen Dolan, Servesh Muralidharan, David Gregg. Compiler support for lightweight context switching
37	--	Pablo Abad, Valentin Puente, José-Ángel Gregorio. LIGERO: A light but efficient router conceived for cache-coherent chip multiprocessors
38	--	Jorge Albericio, Pablo Ibáñez, Víctor Viñals, José María Llabería. Exploiting reuse locality on inclusive shared last-level caches
39	--	Paraskevas Yiapanis, Demian Rosas-Ham, Gavin Brown, Mikel Luján. Optimizing software runtime systems for speculative parallelization
40	--	Cedric Nugteren, Pieter Custers, Henk Corporaal. Algorithmic species: A classification of affine loop nests for parallel programming
41	--	Marco Gerards, Jan Kuper. Optimal DPM and DVFS for frame-based real-time systems
42	--	Zhichao Yan, Hong Jiang, Yujuan Tan, Dan Feng. An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory
43	--	Doris Chen, Deshanand P. Singh. Profile-guided floating- to fixed-point conversion for hybrid FPGA-processor applications
44	--	Yan Cui, Yingxin Wang, Yu Chen, Yuanchun Shi. Lock-contention-aware scheduler: A scalable and energy-efficient method for addressing scalability collapse on multicore systems
45	--	Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan. ADAPT: A framework for coscheduling multithreaded programs
46	--	Michele Tartara, Stefano Crespi-Reghizzi. Continuous learning of compiler heuristics
47	--	Grigorios Chrysos, Panagiotis Dagritzikos, Ioannis Papaefstathiou, Apostolos Dollas. HC-CART: A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system
48	--	Jongwon Lee, Yohan Ko, Kyoungwoo Lee, Jonghee M. Youn, Yunheung Paek. Dynamic code duplication with vulnerability awareness for soft error detection on VLIW architectures
49	--	Fabien Coelho, François Irigoin. API compilation for image hardware accelerators
50	--	Carlos Luque, Miquel Moretó, Francisco J. Cazorla, Mateo Valero. Fair CPU time accounting in CMP+SMT processors
51	--	Pavlos M. Mattheakis, Ioannis Papaefstathiou. Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems
52	--	Riyadh Baghdadi, Albert Cohen, Sven Verdoolaege, Konrad Trifunovic. Improved loop tiling based on the removal of spurious false dependences
53	--	Antoniu Pop, Albert Cohen. OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs
54	--	Sven Verdoolaege, Juan Carlos Juega, Albert Cohen, José Ignacio Gómez, Christian Tenllado, Francky Catthoor. Polyhedral parallel code generation for CUDA
55	--	Yu Du, Miao Zhou, Bruce R. Childers, Rami G. Melhem, Daniel Mossé. Delta-compressed caching for overcoming the write bandwidth limitation of hybrid main memory
56	--	Suresh Purini, Lakshya Jain. Finding good optimization sequences covering program space
57	--	Mehmet E. Belviranli, Laxmi N. Bhuyan, Rajiv Gupta. A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures
58	--	Anurag Negi, J. Rubén Titos Gil. SCIN-cache: Fast speculative versioning in multithreaded cores
59	--	Thibaut Lutz, Christian Fensch, Murray Cole. PARTANS: An autotuning framework for stencil computation on multi-GPU systems
60	--	Chunhua Xiao, M.-C. Frank Chang, Jason Cong, Michael Gill, Zhangqin Huang, Chunyue Liu, Glenn Reinman, Hao Wu. Stream arbitration: Towards efficient bandwidth utilization for emerging on-chip interconnects

13	--	Yangchun Luo, Antonia Zhai. Dynamically dispatching speculative threads to improve sequential execution
14	--	Huimin Cui, Jingling Xue, Lei Wang 0004, Yang Yang, Xiaobing Feng 0002, Dongrui Fan. Extendable pattern-oriented optimization directives
15	--	Adam Wade Lewis, Nian-Feng Tzeng, Soumik Ghosh. Runtime energy consumption estimation for server workloads based on chaotic time-series approximation
16	--	Alejandro Valero, Julio Sahuquillo, Salvador Petit, Pedro López, José Duato. Combining recency of information with selective random and a victim cache in last-level caches
17	--	Bin Li, Li-Shiuan Peh, Li Zhao, Ravi Iyer. Dynamic QoS management for chip multiprocessors
18	--	Polychronis Xekalakis, Nikolas Ioannou, Marcelo Cintra. Mixed speculative multithreaded execution models
19	--	Mageda Sharafeddine, Komal Jothi, Haitham Akkary. Disjoint out-of-order execution processor
20	--	Diego Andrade, Basilio B. Fraguela, Ramon Doallo. Static analysis of the worst-case memory performance for irregular codes with indirections
21	--	Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Olivier Temam, Chengyong Wu. Deconstructing iterative optimization
22	--	Apala Guha, Kim M. Hazelwood, Mary Lou Soffa. Memory optimization of dynamic binary translators for embedded systems
23	--	James R. Geraci, Sharon M. Sacco. A transpose-free in-place SIMD optimized FFT

7	--	Stijn Eyerman, Lieven Eeckhout. Probabilistic modeling for job symbiosis scheduling on SMT processors
8	--	Rachid Seghir, Vincent Loechner, Benoît Meister. Integer affine transformations of parametric ℤ-polytopes and applications to loop nest optimization
9	--	Yi Yang, Ping Xiang, Jingfei Kong, Mike Mantor, Huiyang Zhou. A unified optimizing compiler framework for different GPGPU architectures
10	--	Choonki Jang, Jaejin Lee, Bernhard Egger, Soojung Ryu. Automatic code overlay generation and partially redundant code fetch elimination
11	--	Zahra Abbasi, Georgios Varsamopoulos, Sandeep K. S. Gupta. TACOMA: Server and workload management in internet data centers considering cooling-computing power trade-off and energy proportionality
12	--	Andreas Lankes, Thomas Wild, Stefan Wallentowitz, Andreas Herkersdorf. Benefits of selective packet discard in networks-on-chip

1	--	Walid J. Ghandour, Haitham Akkary, Wes Masri. Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction
2	--	Jaekyu Lee, Hyesoon Kim, Richard W. Vuduc. When Prefetching Works, When It Doesn't, and Why
3	--	Bita Mazloom, Shashidhar Mysore, Mohit Tiwari, Banit Agrawal, Timothy Sherwood. Dataflow Tomography: Information Flow Tracking For Understanding and Visualizing Full Systems
4	--	Jung Ho Ahn, Norman P. Jouppi, Christos Kozyrakis, Jacob Leverich, Robert S. Schreiber. Improving System Energy Efficiency with Memory Rank Subsetting
5	--	Xuejun Yang, Li Wang 0027, Jingling Xue, Qingbo Wu. Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors
6	--	Abhinandan Majumdar, Srihari Cadambi, Michela Becchi, Srimat T. Chakradhar, Hans Peter Graf. A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification

Journal: TACO

Volume 9, Issue 4

Volume 9, Issue 3

Volume 9, Issue 2

Volume 9, Issue 1