- Large pages and lightweight memory management in virtualized environments: can you have it both ways? Binh Pham, Ján Veselý, Gabriel H. Loh, Abhishek Bhattacharjee. 1-12
- Exploiting commutativity to reduce the cost of updates to shared data in cache-coherent systems. Guowei Zhang, Webb Horn, Daniel Sanchez. 13-25
- CCICheck: using µhb graphs to verify the coherence-consistency interface. Yatin A. Manerkar, Daniel Lustig, Michael Pellauer, Margaret Martonosi. 26-37
- HyComp: a hybrid cache compression method for selection of data-type-specific compression methods. Angelos Arelakis, Fredrik Dahlgren, Per Stenström. 38-49
- Doppelgänger: a cache for approximate computing. Joshua San Miguel, Jorge Albericio, Andreas Moshovos, Natalie D. Enright Jerger. 50-61
- The application slowdown model: quantifying and controlling the impact of inter-application interference at shared caches and main memory. Lavanya Subramanian, Vivek Seshadri, Arnab Ghosh, Samira Manabi Khan, Onur Mutlu. 62-75
- MORC: a manycore-oriented compressed cache. Tri M. Nguyen, David Wentzlaff. 76-88
- Avoiding information leakage in the memory controller with fixed service policies. Ali Shafiee, Akhila Gundu, Manjunath Shevgoor, Rajeev Balasubramonian, Mohit Tiwari. 89-101
- Fork path: improving efficiency of ORAM by removing redundant memory accesses. Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, Jia Di. 102-114
- Locking down insecure indirection with hardware-based control-data isolation. William Arthur, Sahil Madeka, Reetuparna Das, Todd M. Austin. 115-127
- Authenticache: harnessing cache ECC for system authentication. Anys Bacha, Radu Teodorescu. 128-140
- Efficiently prefetching complex address patterns. Manjunath Shevgoor, Sahil Koladiya, Rajeev Balasubramonian, Chris Wilkerson, Seth H. Pugsley, Zeshan Chishti. 141-152
- Self-contained, accurate precomputation prefetching. Islam Atta, Xin Tong, Vijayalakshmi Srinivasan, Ioana Baldini, Andreas Moshovos. 153-165
- Confluence: unified instruction supply for scale-out servers. Cansu Kaynak, Boris Grot, Babak Falsafi. 166-177
- IMP: indirect memory prefetcher. Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, Srinivas Devadas. 178-190
- DeSC: decoupled supply-compute communication management for heterogeneous architectures. Tae Jun Ham, Juan L. Aragón, Margaret Martonosi. 191-203
- Efficient warp execution in presence of divergence with collaborative context collection. Farzad Khorasani, Rajiv Gupta, Laxmi N. Bhuyan. 204-215
- Control flow coalescing on a hybrid dataflow/von Neumann GPGPU. Dani Voitsechov, Yoav Etsion. 216-227
- A scalable architecture for ordered parallelism. Mark C. Jeffrey, Suvinay Subramanian, Cong Yan, Joel S. Emer, Daniel Sanchez. 228-241
- More is less: improving the energy efficiency of data movement via opportunistic use of sparse codes. Yanwei Song, Engin Ipek. 242-254
- Improving DRAM latency with dynamic asymmetric subarray. Shih-Lien Lu, Ying Chen Lin, Chia-Lin Yang. 255-266
- Gather-scatter DRAM: in-DRAM address translation to improve the spatial locality of non-unit strided accesses. Vivek Seshadri, Thomas Mullins, Amirali Boroumand, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry. 267-280
- The CRISP performance model for dynamic voltage and frequency scaling in a GPGPU. Rajib Nath, Dean M. Tullsen. 281-293
- Safe limits on voltage reduction efficiency in GPUs: a direct measurement approach. Jingwen Leng, Alper Buyuktosunoglu, Ramon Bertran, Pradip Bose, Vijay Janapa Reddi. 294-307
- Adaptive guardband scheduling to improve system-level efficiency of the POWER7+. Yazhou Zu, Charles R. Lefurgy, Jingwen Leng, Matthew Halpern, Michael S. Floyd, Vijay Janapa Reddi. 308-321
- DynaMOS: dynamic schedule migration for heterogeneous cores. Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott A. Mahlke. 322-333
- Long term parking (LTP): criticality-aware resource allocation in OOO processors. Andreas Sembrant, Trevor E. Carlson, Erik Hagersten, David Black-Schaffer, Arthur Perais, André Seznec, Pierre Michaud. 334-346
- The inner most loop iteration counter: a new dimension in branch history. André Seznec, Joshua San Miguel, Jorge Albericio. 347-357
- Filtered runahead execution with a runahead buffer. Milad Hashemi, Yale N. Patt. 358-369
- Bungee jumps: accelerating indirect branches through HW/SW co-design. Daniel S. McFarlin, Craig B. Zilles. 370-382
- SAWS: synchronization aware GPGPU warp scheduling for multiple independent warp schedulers. Jiwei Liu, Jun Yang, Rami G. Melhem. 383-394
- Enabling coordinated register allocation and thread-level parallelism optimization for GPUs. Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, Dongrui Fan. 395-406
- Free launch: optimizing GPU dynamic kernel launches through thread reuse. Guoyang Chen, Xipeng Shen. 407-419
- GPU register file virtualization. Hyeran Jeon, Gokul Subramanian Ravi, Nam Sung Kim, Murali Annavaram. 420-432
- WarpPool: sharing requests with inter-warp coalescing for throughput processors. John Kloosterman, Jonathan Beaumont, Mick Wollman, Ankit Sethia, Ronald G. Dreslinski, Trevor N. Mudge, Scott A. Mahlke. 433-444
- Ultra-low power render-based collision detection for CPU/GPU systems. Enrique de Lucas, Pedro Marcuello, Joan-Manuel Parcerisa, Antonio González. 445-456
- Execution time prediction for energy-efficient hardware accelerators. Tao Chen, Alexander Rucker, G. Edward Suh. 457-469
- Border control: sandboxing accelerators. Lena E. Olson, Jason Power, Mark D. Hill, David A. Wood. 470-481
- Neural acceleration for GPU throughput processors. Amir Yazdanbakhsh, Jongse Park, Hardik Sharma, Pejman Lotfi-Kamran, Hadi Esmaeilzadeh. 482-493
- Neuromorphic accelerators: a comparison between neuroscience and machine-learning approaches. Zidong Du, Daniel D. Ben-Dayan Rubin, Yunji Chen, Liqiang He, Tianshi Chen, Lei Zhang, Chengyong Wu, Olivier Temam. 494-507
- Prediction-guided performance-energy trade-off for interactive applications. Daniel Lo, Taejoon Song, G. Edward Suh. 508-520
- Architecture-aware automatic computation offload for native applications. Gwangmu Lee, Hyunjoon Park, Seonyeong Heo, Kyung-Ah Chang, Hyogun Lee, Hanjun Kim. 521-532
- Fast support for unstructured data processing: the unified automata processor. Yuanwei Fang, Tung Thanh Hoang, Michela Becchi, Andrew A. Chien. 533-545
- Enabling interposer-based disintegration of multi-core processors. Ajaykumar Kannan, Natalie D. Enright Jerger, Gabriel H. Loh. 546-558
- DCS: a fast and scalable device-centric server architecture. Jaehyung Ahn, Dongup Kwon, Youngsok Kim, Mohammadamin Ajdari, Jaewon Lee, Jangwoo Kim. 559-571
- Modeling the implications of DRAM failures and protection techniques on datacenter TCO. Panagiota Nikolaou, Yiannakis Sazeides, Lorena Ndreu, Marios Kleanthous. 572-584
- TimeTrader: exploiting latency tail to save datacenter energy for online search. Balajee Vamanan, Hamza Bin Sohail, Jahangir Hasan, T. N. Vijaykumar. 585-597
- Rubik: fast analytical power management for latency-critical systems. Harshad Kasture, Davide B. Bartolini, Nathan Beckmann, Daniel Sanchez. 598-610
- CLEAN-ECC: high reliability ECC for adaptive granularity memory system. Seong-Lyong Gong, Minsoo Rhu, Jungrae Kim, Jinsuk Chung, Mattan Erez. 611-622
- vCache: architectural support for transparent and isolated virtual LLCs in virtualized environments. Daehoon Kim, Hwanju Kim, Nam Sung Kim, Jaehyuk Huh. 623-634
- An integrated concurrency and core-ISA architectural envelope definition, and test oracle, for IBM POWER multiprocessors. Kathryn E. Gray, Gabriel Kerneis, Dominic P. Mulligan, Christopher Pulte, Susmit Sarkar, Peter Sewell. 635-646
- Efficient GPU synchronization without scopes: saying no to complex consistency models. Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve. 647-659
- Efficient persist barriers for multicores. Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas. 660-671
- ThyNVM: enabling software-transparent crash consistency in persistent memory systems. Jinglei Ren, Jishen Zhao, Samira Manabi Khan, Jongmoo Choi, Yongwei Wu, Onur Mutlu. 672-685
- Coherence domain restriction on large scale systems. Yaosheng Fu, Tri M. Nguyen, David Wentzlaff. 686-698
- Efficiently enforcing strong memory ordering in GPUs. Abhayendra Singh, Shaizeen Aga, Satish Narayanasamy. 699-712
- Characterizing, modeling, and improving the QoE of mobile devices with low battery level. Kaige Yan, Xingyao Zhang, Xin Fu. 713-724
- Cross-architecture performance prediction (XAPP) using CPU code to predict GPU performance. Newsha Ardalani, Clint Lestourgeon, Karthikeyan Sankaralingam, Xiaojin Zhu. 725-737
- A fast and accurate analytical technique to compute the AVF of sequential bits in a processor. Steven Raasch, Arijit Biswas, Jon Stephan, Paul Racunas, Joel S. Emer. 738-749
- Enabling portable energy efficiency with memory accelerated library. Qi Guo, Tze Meng Low, Nikolaos Alachiotis, Berkin Akin, Larry T. Pileggi, James C. Hoe, Franz Franchetti. 750-761
- Microarchitectural implications of event-driven server-side web applications. Yuhao Zhu, Daniel Richins, Matthew Halpern, Vijay Janapa Reddi. 762-774