20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014, Orlando, FL, USA, February 15-19, 2014

Publisher: IEEE, 2014.

Conference: hpca2014

Locality-aware data replication in the Last-Level Cache. George Kurian, Srinivas Devadas, Omer Khan. 1-12

Adaptive placement and migration policy for an STT-RAM-based hybrid cache. Zhe Wang, Daniel A. Jiménez, Cong Xu, Guangyu Sun, Yuan Xie. 13-24

DASCA: Dead Write Prediction Assisted STT-RAM Cache Architecture. Junwhan Ahn, Sungjoo Yoo, Kiyoung Choi. 25-36

A detailed GPU cache model based on reuse distance theory. Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal, Henri Bal. 37-48

Precision-aware soft error protection for GPUs. David J. Palframan, Nam Sung Kim, Mikko H. Lipasti. 49-59

Understanding the impact of gate-level physical reliability effects on whole program execution. Raghuraman Balasubramanian, Karthikeyan Sankaralingam. 60-71

Accordion: Toward soft Near-Threshold Voltage Computing. Ulya R. Karpuzcu, Ismail Akturk, Nam Sung Kim. 72-83

Mosaic: Exploiting the spatial locality of process variation to reduce refresh energy in on-chip eDRAM modules. Aditya Agrawal, Amin Ansari, Josep Torrellas. 84-95

Low-overhead and high coverage run-time race detection through selective meta-data management. Ruirui C. Huang, Erik Halberg, Andrew Ferraiuolo, G. Edward Suh. 96-107

FADE: A programmable filtering accelerator for instruction-grain monitoring. Sotiria Fytraki, Evangelos Vlachos, Yusuf Onur Koçberber, Babak Falsafi, Boris Grot. 108-119

Dynamically detecting and tolerating IF-Condition Data Races. Shanxiang Qi, Abdullah Muzahid, Wonsun Ahn, Josep Torrellas. 120-131

Exploiting thermal energy storage to reduce data center capital and operating expenses. Wenli Zheng, Kai Ma, Xiaorui Wang. 132-141

Implications of high energy proportional servers on cluster-wide energy proportionality. Daniel Wong, Murali Annavaram. 142-153

Strategies for anticipating risk in heterogeneous system design. Marisabel Guevara, Benjamin Lubin, Benjamin C. Lee. 154-164

TSO-CC: Consistency directed cache coherence for TSO. Marco Elver, Vijay Nagarajan. 165-176

Stash directory: A scalable directory for many-core coherence. Socrates Demetriades, Sangyeun Cho. 177-188

QuickRelease: A throughput-oriented approach to release consistency on GPUs. Blake A. Hechtman, Shuai Che, Derek R. Hower, Yingying Tian, Bradford M. Beckmann, Mark D. Hill, Steven K. Reinhardt, David A. Wood. 189-200

A Non-Inclusive Memory Permissions architecture for protection against cross-layer attacks. Jesse Elwell, Ryan Riley, Nael B. Abu-Ghazaleh, Dmitry Ponomarev. 201-212

Suppressing the Oblivious RAM timing channel while making information leakage and program efficiency trade-offs. Christopher W. Fletcher, Ling Ren, Xiangyao Yu, Marten van Dijk, Omer Khan, Srinivas Devadas. 213-224

Timing channel protection for a shared memory controller. Yao Wang, Andrew Ferraiuolo, G. Edward Suh. 225-236

STM: Cloning the spatial and temporal memory access behavior. Amro Awad, Yan Solihin. 237-247

A scalable multi-path microarchitecture for efficient GPU control flow. Ahmed ElTantawy, Jessica Wenjie Ma, Mike O'Connor, Tor M. Aamodt. 248-259

Improving GPGPU resource utilization through alternative thread block scheduling. Minseok Lee, Seokwoo Song, Joosik Moon, John Kim, Woong Seo, Yeon Gon Cho, Soojung Ryu. 260-271

MRPB: Memory request prioritization for massively parallel processors. Wenhao Jia, Kelly A. Shaw, Margaret Martonosi. 272-283

Warp-level divergence in GPUs: Characterization, impact, and mitigation. Ping Xiang, Yi Yang, Huiyang Zhou. 284-295

MP3: Minimizing performance penalty for power-gating of Clos network-on-chip. Lizhong Chen, Lihang Zhao, Ruisheng Wang, Timothy Mark Pinkston. 296-307

Up by their bootstraps: Online learning in Artificial Neural Networks for CMP uncore power management. Jae Yeon Won, Xi Chen, Paul Gratz, Jiang Hu, Vassos Soteriou. 308-319

QORE: A fault tolerant network-on-chip architecture with power-efficient quad-function channel (QFC) buffers. Dominic DiTomaso, Avinash Karanth Kodi, Ahmed Louri. 320-331

Transportation-network-inspired network-on-chip. Hanjoon Kim, Gwangsun Kim, Seungryoul Maeng, Hwasoo Yeo, John Kim. 332-343

Improving system throughput and fairness simultaneously in shared memory CMP systems via Dynamic Bank Partitioning. Mingli Xie, Dong Tong, Kan Huang, Xu Cheng. 344-355

Improving DRAM performance by parallelizing refreshes with accesses. Kevin Kai-Wei Chang, Donghyuk Lee, Zeshan Chishti, Alaa R. Alameldeen, Chris Wilkerson, Yoongu Kim, Onur Mutlu. 356-367

CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture. Tao Zhang, Matthew Poremba, Cong Xu, Guangyu Sun, Yuan Xie. 368-379

DraMon: Predicting memory bandwidth usage of multi-threaded programs with high accuracy and low overhead. Wei Wang, Tanima Dey, Jack W. Davidson, Mary Lou Soffa. 380-391

PVCoherence: Designing flat coherence protocols for scalable verification. Meng Zhang, Jesse D. Bingham, John Erickson, Daniel J. Sorin. 392-403

Atomic SC for simple in-order processors. Dibakar Gope, Mikko H. Lipasti. 404-415

Concurrent and consistent virtual machine introspection with hardware transactional memory. Yutao Liu, Yubin Xia, Haibing Guan, Binyu Zang, Haibo Chen. 416-427

Practical data value speculation for future high-end processors. Arthur Perais, André Seznec. 428-439

Tangle: Route-oriented dynamic voltage minimization for variation-afflicted, energy-efficient on-chip networks. Amin Ansari, Asit K. Mishra, Jianping Xu, Josep Torrellas. 440-451

Improving cache performance using read-write partitioning. Samira Manabi Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, Daniel A. Jiménez. 452-463

NUAT: A non-uniform access time memory controller. Wongyu Shin, Jeongmin Yang, Jungwhan Choi, Lee-Sup Kim. 464-475

Improving in-memory database index performance with Intel® Transactional Synchronization Extensions. Tomas Karnagel, Roman Dementiev, Ravi Rajwar, Konrad Lai, Thomas Legler, Benjamin Schlegel, Wolfgang Lehner. 476-487

BigDataBench: A big data benchmark suite from internet services. Lei Wang, Jianfeng Zhan, Chunjie Luo, Yuqing Zhu, Qiang Yang, Yongqiang He, Wanling Gao, Zhen Jia, Yingjie Shi, Shujie Zhang, Chen Zheng, Gang Lu, Kent Zhan, Xiaona Li, Bizhu Qiu. 488-499

3D stacking of high-performance processors. Philip G. Emma, Alper Buyuktosunoglu, Michael B. Healy, Krishnan Kailas, Valentin Puente, Roy Yu, Allan Hartstein, Pradip Bose, Jaime H. Moreno. 500-511

Reducing the cost of persistence for nonvolatile heaps in end user devices. Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan. 512-523

Sprinkler: Maximizing resource utilization in many-chip solid state disks. Myoungsoo Jung, Mahmut T. Kandemir. 524-535

Over-clocked SSD: Safely running beyond flash memory chip I/O clock specs. Kai Zhao, Kalyana S. Venkataraman, Xuebin Zhang, Jiangpeng Li, Ning Zheng, Tong Zhang. 536-545

GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management. Youngsok Kim, Jaewon Lee, Jae-Eon Jo, Jangwoo Kim. 546-557

Increasing TLB reach by exploiting clustering in page translations. Binh Pham, Abhishek Bhattacharjee, Yasuko Eckert, Gabriel H. Loh. 558-567

Supporting x86-64 address translation for 100s of GPU lanes. Jason Power, Mark D. Hill, David A. Wood. 568-578

Scalably verifiable dynamic power management. Opeoluwa Matthews, Meng Zhang, Daniel J. Sorin. 579-590

Revolver: Processor architecture for power efficient loop execution. Mitchell Hayenga, Vignyan Reddy Kothinti Naresh, Mikko H. Lipasti. 591-602

Dynamic management of TurboMode in modern multi-core chips. David Lo, Christos Kozyrakis. 603-613

Spare register aware prefetching for graph algorithms on GPUs. Nagesh B. Lakshminarayana, Hyesoon Kim. 614-625

Sandbox Prefetching: Safe run-time evaluation of aggressive prefetchers. Seth H. Pugsley, Zeshan Chishti, Chris Wilkerson, Peng-fei Chuang, Robert L. Scott, Aamer Jaleel, Shih-Lien Lu, Kingsum Chow, Rajeev Balasubramonian. 626-637

MemZip: Exploring unconventional benefits from memory compression. Ali Shafiee, Meysam Taassori, Rajeev Balasubramonian, Al Davis. 638-649

CDTT: Compiler-generated data-triggered threads. Hung-Wei Tseng, Dean M. Tullsen. 650-661

Accelerating decoupled look-ahead via weak dependence removal: A metaheuristic approach. Raj Parihar, Michael C. Huang. 662-677

Undersubscribed threading on clustered cache architectures. Wim Heirman, Trevor E. Carlson, Kenzo Van Craeynest, Ibrahim Hur, Aamer Jaleel, Lieven Eeckhout. 678-689