Journal: TACO

Volume 8, Issue 4

18 -- 0 Per Stenström, Koen De Bosschere. Introduction to the special issue on high-performance and embedded architectures and compilers
19 -- 0 Jorge Albericio, Ruben Gran Tejero, Pablo Ibáñez, Víctor Viñals, José María Llabería. ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache
20 -- 0 Ali Galip Bayrak, Nikola Velickovic, Paolo Ienne, Wayne Burleson. An architecture-independent instruction shuffler to protect against side-channel attacks
21 -- 0 John Demme, Simha Sethumadhavan. Approximate graph clustering for program characterization
22 -- 0 Mihai Pricopi, Tulika Mitra. Bahurupi: A polymorphic heterogeneous multi-core architecture
23 -- 0 Jeroen V. Cleemput, Bart Coppens, Bjorn De Sutter. Compiler mitigations for time attacks on modern x86 processors
24 -- 0 Jason McCandless, David Gregg. Compiler techniques to improve dynamic branch prediction for indirect jump and call instructions
25 -- 0 Antonio García-Guirado, Ricardo Fernández Pascual, Alberto Ros, José M. García. DAPSCO: Distance-aware partially shared cache organization
26 -- 0 Zhenjiang Wang, Chenggang Wu, Pen-Chung Yew, Jianjun Li, Di Xu. On-the-fly structure splitting for heap objects
27 -- 0 Dibyendu Das, Benoît Dupont de Dinechin, Ramakrishna Upadrasta. Efficient liveness computation using merge sets and DJ-graphs
28 -- 0 George Patsilaras, Niket K. Choudhary, James Tuck. Efficiently exploiting memory level parallelism on asymmetric coupled cores in the dark silicon era
29 -- 0 Roman Malits, Evgeny Bolotin, Avinoam Kolodny, Avi Mendelson. Exploring the limits of GPGPU scheduling in control flow bound applications
30 -- 0 Lois Orosa, Elisardo Antelo, Javier D. Bruguera. FlexSig: Implementing flexible hardware signatures
31 -- 0 J. Rubén Titos Gil, Manuel E. Acacio, José M. García, Tim Harris, Adrián Cristal, Osman S. Unsal, Ibrahim Hur, Mateo Valero. Hardware transactional memory with software-defined conflicts
32 -- 0 Yongjoo Kim, Jongeun Lee, Toan X. Mai, Yunheung Paek. Improving performance of nested loops on reconfigurable array processors
33 -- 0 Madhura Purnaprajna, Paolo Ienne. Making wide-issue VLIW processors viable on FPGAs
34 -- 0 Petar Radojkovic, Sylvain Girbal, Arnaud Grasset, Eduardo Quiñones, Sami Yehia, Francisco J. Cazorla. On the evaluation of the impact of shared resources in multithreaded COTS processors in time-critical environments
35 -- 0 Leonid Domnitser, Aamer Jaleel, Jason Loew, Nael B. Abu-Ghazaleh, Dmitry Ponomarev. Non-monopolizable caches: Low-complexity mitigation of cache side channel attacks
36 -- 0 Alejandro Rico, Felipe Cabarcas, Carlos Villavieja, Milan Pavlovic, Augusto Vega, Yoav Etsion, Alex Ramírez, Mateo Valero. On the simulation of large-scale architectures using multiple application abstraction levels
37 -- 0 Selma Saidi, Pranav Tendulkar, Thierry Lepley, Oded Maler. Optimizing explicit data transfers for data parallel applications on the cell architecture
38 -- 0 Min Feng, Changhui Lin, Rajiv Gupta. PLDS: Partitioning linked data structures for parallelism
39 -- 0 Benoît Pradelle, Alain Ketterlin, Philippe Clauss. Polyhedral parallelization of binary code
40 -- 0 Yaozu Dong, Yu Chen, Zhenhao Pan, Jinquan Dai, Yunhong Jiang. ReNIC: Architectural extension to SR-IOV I/O virtualization for efficient replication
41 -- 0 Tom M. Bruintjes, Karel H. G. Walters, Sabih H. Gerez, Bert Molenkamp, Gerard J. M. Smit. Sabrewing: A lightweight architecture for combined floating-point and integer arithmetic
42 -- 0 Mario Kicherer, Fabian Nowak, Rainer Buchty, Wolfgang Karl. Seamlessly portable applications: Managing the diversity of modern heterogeneous systems
43 -- 0 Nathanael Premillieu, André Seznec. SYRANT: SYmmetric resource allocation on not-taken and taken paths
44 -- 0 William Hasenplaugh, Pritpal S. Ahuja, Aamer Jaleel, Simon C. Steely Jr., Joel S. Emer. The gradient-based cache partitioning algorithm
45 -- 0 Javier Lira, Timothy M. Jones, Carlos Molina, Antonio González. The migration prefetcher: Anticipating data promotion in dynamic NUCA caches
46 -- 0 Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan. Thread Tranquilizer: Dynamically reducing performance variation
47 -- 0 Dong-song Zhang, Deke Guo, Fang-Yuan Chen, Fei Wu, Tong Wu, Ting Cao, Shiyao Jin. TL-plane-based multi-core energy-efficient real-time scheduling algorithm for sporadic tasks
48 -- 0 Michael J. Lyons, Mark Hempstead, Gu-Yeon Wei, David Brooks. The accelerator store: A shared memory framework for accelerator-based systems
49 -- 0 Daniel A. Orozco, Elkin Garcia, Rishi Khan, Kelly Livingston, Guang R. Gao. Toward high-throughput algorithms on many-core architectures
50 -- 0 Kevin Stock, Louis-Noël Pouchet, P. Sadayappan. Using machine learning to improve automatic vectorization
51 -- 0 Kanit Therdsteerasukdi, Gyungsu Byun, Jason Cong, M. Frank Chang, Glenn Reinman. Utilizing RF-I and intelligent scheduling for better throughput/watt in a mobile GPU memory system
52 -- 0 Frederick Ryckbosch, Stijn Polfliet, Lieven Eeckhout. VSim: Simulating multi-server setups at near native hardware speed
53 -- 0 Miao Zhou, Yu Du, Bruce R. Childers, Rami G. Melhem, Daniel Mossé. Writeback-aware partitioning and replacement for last-level caches in phase change main memory systems
54 -- 0 Qingping Wang, Sameer Kulkarni, John Cavazos, Michael F. Spear. A transactional memory with automatic performance tuning
55 -- 0 Bartosz Bogdanski, Sven-Arne Reinemo, Frank Olaf Sem-Jacobsen, Ernst Gunnar Gran. sFtree: A fully connected and deadlock-free switch-to-switch routing algorithm for fat-trees

Volume 8, Issue 3

10 -- 0 Xi E. Chen, Tor M. Aamodt. Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
11 -- 0 Marios Kleanthous, Yiannakis Sazeides. CATCH: A mechanism for dynamically detecting cache-content-duplication in instruction caches
12 -- 0 Hans Vandierendonck, André Seznec. Managing SMT resource usage through speculative instruction window weighting
13 -- 0 Po-Han Wang, Chia-Lin Yang, Yen-ming Chen, Yu-Jung Cheng. Power gating strategies on GPUs
14 -- 0 Min Feng, Chen Tian, Changhui Lin, Rajiv Gupta. Dynamic access distance driven cache replacement
15 -- 0 Ahmad Samih, Yan Solihin, Anil Krishna. Evaluating placement policies for managing capacity sharing in CMP architectures with private caches
16 -- 0 Chang-Ching Yeh, Kuei-Chung Chang, Tien-Fu Chen, Chingwei Yeh. Maintaining performance on power gating of microprocessor functional units by using a predictive pre-wakeup strategy
17 -- 0 Hyunjin Lee, Sangyeun Cho, Bruce R. Childers. DEFCAM: A design and evaluation framework for defect-tolerant cache memories

Volume 8, Issue 2

6 -- 0 Xiangyu Dong, Yuan Xie, Naveen Muralimanohar, Norman P. Jouppi. Hybrid checkpointing using emerging nonvolatile memories for future exascale systems
7 -- 0 Jianjun Li, Chenggang Wu, Wei-Chung Hsu. Efficient and effective misaligned data access handling in a dynamic binary translation system
8 -- 0 Guru Venkataramani, Christopher J. Hughes, Sanjeev Kumar, Milos Prvulovic. DeFT: Design space exploration for on-the-fly detection of coherence misses
9 -- 0 Jason Hiser, Daniel W. Williams, Wei Hu, Jack W. Davidson, Jason Mars, Bruce R. Childers. Evaluating indirect branch handling mechanisms in software dynamic translation systems

Volume 8, Issue 1

1 -- 0 Stijn Eyerman, Lieven Eeckhout. Fine-grained DVFS using on-chip regulators
2 -- 0 Chen-Yong Cher, Eren Kursun. Exploring the effects of on-chip thermal variation on high-performance multicore architectures
3 -- 0 Carole-Jean Wu, Margaret Martonosi. Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches
4 -- 0 Lucas Vespa, Ning Weng. Deterministic finite automata characterization and optimization for scalable pattern matching
5 -- 0 Abhishek Bhattacharjee, Gilberto Contreras, Margaret Martonosi. Parallelization libraries: Characterizing and reducing overheads