Journal: TACO

Volume 9, Issue 4

24 -- 0Bart Coppens, Bjorn De Sutter, Jonas Maebe. Feedback-driven binary code diversification
25 -- 0Jeremy Fowers, Greg Brown, John Robert Wernsing, Greg Stitt. A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors
26 -- 0Erven Rohou, Kevin Williams, David Yuste. Vectorization technology to improve interpreter performance
27 -- 0Jimmy Cleary, Owen Callanan, Mark Purcell, David Gregg. Fast asymmetric thread synchronization
28 -- 0Yong Li 0009, Rami G. Melhem, Alex K. Jones. PS-TLB: Leveraging page classification information for fast, scalable and efficient translation for future CMPs
29 -- 0Kristof Du Bois, Stijn Eyerman, Lieven Eeckhout. Per-thread cycle accounting in multicore processors
30 -- 0Christian Wimmer, Michael Haupt, Michael L. Van de Vanter, Mick J. Jordan, Laurent Daynès, Doug Simon. Maxine: An approachable virtual machine for, and in, Java
31 -- 0Malik Murtaza Khan, Protonu Basu, Gabe Rudy, Mary W. Hall, Chun Chen, Jacqueline Chame. A script-based autotuning compiler system to generate high-performance CUDA code
32 -- 0Kenzo Van Craeynest, Lieven Eeckhout. Understanding fundamental design choices in single-ISA heterogeneous multicore architectures
33 -- 0Samuel Antao, Leonel Sousa. The CRNS framework and its application to programmable and reconfigurable cryptography
34 -- 0Boubacar Diouf, Can Hantas, Albert Cohen, Özcan Özturk, Jens Palsberg. A decoupled local memory allocator
35 -- 0Huimin Cui, Qing Yi, Jingling Xue, Xiaobing Feng. Layout-oblivious compiler optimization for matrix computations
36 -- 0Stephen Dolan, Servesh Muralidharan, David Gregg. Compiler support for lightweight context switching
37 -- 0Pablo Abad, Valentin Puente, José-Ángel Gregorio. LIGERO: A light but efficient router conceived for cache-coherent chip multiprocessors
38 -- 0Jorge Albericio, Pablo Ibáñez, Víctor Viñals, José María Llabería. Exploiting reuse locality on inclusive shared last-level caches
39 -- 0Paraskevas Yiapanis, Demian Rosas-Ham, Gavin Brown, Mikel Luján. Optimizing software runtime systems for speculative parallelization
40 -- 0Cedric Nugteren, Pieter Custers, Henk Corporaal. Algorithmic species: A classification of affine loop nests for parallel programming
41 -- 0Marco Gerards, Jan Kuper. Optimal DPM and DVFS for frame-based real-time systems
42 -- 0Zhichao Yan, Hong Jiang, Yujuan Tan, Dan Feng. An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory
43 -- 0Doris Chen, Deshanand P. Singh. Profile-guided floating- to fixed-point conversion for hybrid FPGA-processor applications
44 -- 0Yan Cui, Yingxin Wang, Yu Chen, Yuanchun Shi. Lock-contention-aware scheduler: A scalable and energy-efficient method for addressing scalability collapse on multicore systems
45 -- 0Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan. ADAPT: A framework for coscheduling multithreaded programs
46 -- 0Michele Tartara, Stefano Crespi-Reghizzi. Continuous learning of compiler heuristics
47 -- 0Grigorios Chrysos, Panagiotis Dagritzikos, Ioannis Papaefstathiou, Apostolos Dollas. HC-CART: A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system
48 -- 0Jongwon Lee, Yohan Ko, Kyoungwoo Lee, Jonghee M. Youn, Yunheung Paek. Dynamic code duplication with vulnerability awareness for soft error detection on VLIW architectures
49 -- 0Fabien Coelho, François Irigoin. API compilation for image hardware accelerators
50 -- 0Carlos Luque, Miquel Moretó, Francisco J. Cazorla, Mateo Valero. Fair CPU time accounting in CMP+SMT processors
51 -- 0Pavlos M. Mattheakis, Ioannis Papaefstathiou. Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems
52 -- 0Riyadh Baghdadi, Albert Cohen, Sven Verdoolaege, Konrad Trifunovic. Improved loop tiling based on the removal of spurious false dependences
53 -- 0Antoniu Pop, Albert Cohen. OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs
54 -- 0Sven Verdoolaege, Juan Carlos Juega, Albert Cohen, José Ignacio Gómez, Christian Tenllado, Francky Catthoor. Polyhedral parallel code generation for CUDA
55 -- 0Yu Du, Miao Zhou, Bruce R. Childers, Rami G. Melhem, Daniel Mossé. Delta-compressed caching for overcoming the write bandwidth limitation of hybrid main memory
56 -- 0Suresh Purini, Lakshya Jain. Finding good optimization sequences covering program space
57 -- 0Mehmet E. Belviranli, Laxmi N. Bhuyan, Rajiv Gupta. A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures
58 -- 0Anurag Negi, J. Rubén Titos Gil. SCIN-cache: Fast speculative versioning in multithreaded cores
59 -- 0Thibaut Lutz, Christian Fensch, Murray Cole. PARTANS: An autotuning framework for stencil computation on multi-GPU systems
60 -- 0Chunhua Xiao, M.-C. Frank Chang, Jason Cong, Michael Gill, Zhangqin Huang, Chunyue Liu, Glenn Reinman, Hao Wu. Stream arbitration: Towards efficient bandwidth utilization for emerging on-chip interconnects

Volume 9, Issue 3

13 -- 0Yangchun Luo, Antonia Zhai. Dynamically dispatching speculative threads to improve sequential execution
14 -- 0Huimin Cui, Jingling Xue, Lei Wang 0004, Yang Yang, Xiaobing Feng 0002, Dongrui Fan. Extendable pattern-oriented optimization directives
15 -- 0Adam Wade Lewis, Nian-Feng Tzeng, Soumik Ghosh. Runtime energy consumption estimation for server workloads based on chaotic time-series approximation
16 -- 0Alejandro Valero, Julio Sahuquillo, Salvador Petit, Pedro López, José Duato. Combining recency of information with selective random and a victim cache in last-level caches
17 -- 0Bin Li, Li-Shiuan Peh, Li Zhao, Ravi Iyer. Dynamic QoS management for chip multiprocessors
18 -- 0Polychronis Xekalakis, Nikolas Ioannou, Marcelo Cintra. Mixed speculative multithreaded execution models
19 -- 0Mageda Sharafeddine, Komal Jothi, Haitham Akkary. Disjoint out-of-order execution processor
20 -- 0Diego Andrade, Basilio B. Fraguela, Ramon Doallo. Static analysis of the worst-case memory performance for irregular codes with indirections
21 -- 0Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Olivier Temam, Chengyong Wu. Deconstructing iterative optimization
22 -- 0Apala Guha, Kim M. Hazelwood, Mary Lou Soffa. Memory optimization of dynamic binary translators for embedded systems
23 -- 0James R. Geraci, Sharon M. Sacco. A transpose-free in-place SIMD optimized FFT

Volume 9, Issue 2

7 -- 0Stijn Eyerman, Lieven Eeckhout. Probabilistic modeling for job symbiosis scheduling on SMT processors
8 -- 0Rachid Seghir, Vincent Loechner, Benoît Meister. Integer affine transformations of parametric ℤ-polytopes and applications to loop nest optimization
9 -- 0Yi Yang, Ping Xiang, Jingfei Kong, Mike Mantor, Huiyang Zhou. A unified optimizing compiler framework for different GPGPU architectures
10 -- 0Choonki Jang, Jaejin Lee, Bernhard Egger, Soojung Ryu. Automatic code overlay generation and partially redundant code fetch elimination
11 -- 0Zahra Abbasi, Georgios Varsamopoulos, Sandeep K. S. Gupta. TACOMA: Server and workload management in internet data centers considering cooling-computing power trade-off and energy proportionality
12 -- 0Andreas Lankes, Thomas Wild, Stefan Wallentowitz, Andreas Herkersdorf. Benefits of selective packet discard in networks-on-chip

Volume 9, Issue 1

1 -- 0Walid J. Ghandour, Haitham Akkary, Wes Masri. Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction
2 -- 0Jaekyu Lee, Hyesoon Kim, Richard W. Vuduc. When Prefetching Works, When It Doesn't, and Why
3 -- 0Bita Mazloom, Shashidhar Mysore, Mohit Tiwari, Banit Agrawal, Timothy Sherwood. Dataflow Tomography: Information Flow Tracking For Understanding and Visualizing Full Systems
4 -- 0Jung Ho Ahn, Norman P. Jouppi, Christos Kozyrakis, Jacob Leverich, Robert S. Schreiber. Improving System Energy Efficiency with Memory Rank Subsetting
5 -- 0Xuejun Yang, Li Wang 0027, Jingling Xue, Qingbo Wu. Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors
6 -- 0Abhinandan Majumdar, Srihari Cadambi, Michela Becchi, Srimat T. Chakradhar, Hans Peter Graf. A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification