- KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism. Izzat El Hajj, Juan Gómez-Luna, Cheng Li, Li-Wen Chang, Dejan S. Milojicic, Wen-mei Hwu. 1-12
- Chainsaw: Von-Neumann accelerators to leverage fused instruction chains. Amirali Sharifian, Snehasish Kumar, Apala Guha, Arrvindh Shriraman. 1-14
- Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs. Naifeng Jing, Jianfei Wang, Fengfeng Fan, Wenkang Yu, Li Jiang, Chao Li, Xiaoyao Liang. 1-12
- Spectral profiling: Observer-effect-free profiling by monitoring EM emanations. Nader Sehatbakhsh, Alireza Nazari, Alenka G. Zajic, Milos Prvulovic. 1-11
- Exploiting semantic commutativity in hardware speculation. Guowei Zhang, Virginia Chiu, Daniel Sanchez. 1-12
- An ultra low-power hardware accelerator for automatic speech recognition. Reza Yazdani, Albert Segura, Jose-Maria Arnau, Antonio Gonzalez. 1-12
- Fused-layer CNN accelerators. Manoj Alwani, Han Chen, Michael Ferdman, Peter Milder. 1-12
- A cloud-scale acceleration architecture. Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger. 1-13
- Co-designing accelerators and SoC interfaces using gem5-Aladdin. Yakun Sophia Shao, Sam Likun Xi, Vijayalakshmi Srinivasan, Gu-Yeon Wei, David M. Brooks. 1-12
- OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures. Jia Zhan, Onur Kayiran, Gabriel H. Loh, Chita R. Das, Yuan Xie. 1-13
- Towards efficient server architecture for virtualized network function deployment: Implications and implementations. Yang Hu, Tao Li. 1-12
- Keynotes: Internet of Things: History and hype, technology and policy. Margaret Martonosi. 1-2
- Cambricon-X: An accelerator for sparse neural networks. Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, Yunji Chen. 1-12
- Low-cost soft error resilience with unified data verification and fine-grained recovery for acoustic sensor based detection. Qingrui Liu, Changhee Jung, Dongyoon Lee, Devesh Tiwari. 1-12
- Efficient data supply for hardware accelerators with prefetching and access/execute decoupling. Tao Chen, G. Edward Suh. 1-12
- Dictionary sharing: An efficient cache compression scheme for compressed caches. Biswabandan Panda, André Seznec. 1-12
- Ti-states: Processor power management in the temperature inversion region. Yazhou Zu, Wei Huang, Indrani Paul, Vijay Janapa Reddi. 1-13
- Quantifying and improving the efficiency of hardware-based mobile malware detectors. Mikhail Kazdagli, Vijay Janapa Reddi, Mohit Tiwari. 1-13
- Data-centric execution of speculative parallel programs. Mark C. Jeffrey, Suvinay Subramanian, Maleen Abeydeera, Joel S. Emer, Daniel Sanchez. 1-13
- Dynamic error mitigation in NoCs using intelligent prediction techniques. Dominic DiTomaso, Travis Boraten, Avinash Kodi, Ahmed Louri. 1-12
- GRAPE: Minimizing energy for GPU applications with performance requirements. Muhammad Husni Santriaji, Henry Hoffmann. 1-13
- CrystalBall: Statically analyzing runtime behavior via deep sequence learning. Stephen Zekany, Daniel Rings, Nathan Harada, Michael A. Laurenzano, Lingjia Tang, Jason Mars. 1-12
- NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints. Yu Ji, Youhui Zhang, Shuangchen Li, Ping Chi, Cihang Jiang, Peng Qu, Yuan Xie, Wenguang Chen. 1-13
- Concise loads and stores: The case for an asymmetric compute-memory architecture for approximation. Animesh Jain, Parker Hill, Shih-Chieh Lin, Muneeb Khan, Md E. Haque, Michael A. Laurenzano, Scott A. Mahlke, Lingjia Tang, Jason Mars. 1-13
- ReplayConfusion: Detecting cache-based covert channel attacks using record and replay. Mengjia Yan, Yasser Shalabi, Josep Torrellas. 1-14
- CANDY: Enabling coherent DRAM caches for multi-node systems. Chia-Chen Chou, Aamer Jaleel, Moinuddin K. Qureshi. 1-13
- Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems. Hadi Asghari Moghaddam, Young Hoon Son, Jung Ho Ahn, Nam Sung Kim. 1-13
- MIMD synchronization on SIMT architectures. Ahmed ElTantawy, Tor M. Aamodt. 1-14
- Bridging the I/O performance gap for big data workloads: A new NVDIMM-based approach. Renhai Chen, Zili Shao, Tao Li. 1-12
- Approxilyzer: Towards a systematic framework for instruction-level approximate computing and its application to hardware resiliency. Radha Venkatagiri, Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Sarita V. Adve. 1-14
- Lazy release consistency for GPUs. Johnathan Alsop, Marc S. Orr, Bradford M. Beckmann, David A. Wood. 1-14
- Zorua: A holistic approach to resource virtualization in GPUs. Nandita Vijaykumar, Kevin Hsieh, Gennady Pekhimenko, Samira Manabi Khan, Ashish Shrestha, Saugata Ghose, Adwait Jog, Phillip B. Gibbons, Onur Mutlu. 1-14
- NeSC: Self-virtualizing nested storage controller. Yonatan Gottesman, Yoav Etsion. 1-12
- HARE: Hardware accelerator for regular expressions. Vaibhav Gogte, Aasheesh Kolli, Michael J. Cafarella, Loris D'Antoni, Thomas F. Wenisch. 1-12
- vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design. Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler. 1-13
- Continuous runahead: Transparent hardware acceleration for memory intensive workloads. Milad Hashemi, Onur Mutlu, Yale N. Patt. 1-12
- Reducing data movement energy via online data clustering and encoding. Shibo Wang, Engin Ipek. 1-13
- Continuous shape shifting: Enabling loop co-optimization via near-free dynamic code rewriting. Animesh Jain, Michael A. Laurenzano, Lingjia Tang, Jason Mars. 1-12
- Improving energy efficiency of DRAM by exploiting half page row access. Heonjae Ha, Ardavan Pedram, Stephen Richardson, Shahar Kvatinsky, Mark Horowitz. 1-12
- Evaluating programmable architectures for imaging and vision applications. Artem Vasilyev, Nikhil Bhagdikar, Ardavan Pedram, Stephen Richardson, Shahar Kvatinsky, Mark Horowitz. 1-13
- Path confidence based lookahead prefetching. Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, Zeshan Chishti. 1-12
- Perceptron learning for reuse prediction. Elvira Teran, Zhe Wang, Daniel A. Jiménez. 1-12
- Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. Tae Jun Ham, Lisa Wu, Narayanan Sundaram, Nadathur Satish, Margaret Martonosi. 1-13
- Racer: TSO consistency via race detection. Alberto Ros, Stefanos Kaxiras. 1-13
- Contention-based congestion management in large-scale networks. Gwangsun Kim, Changhyun Kim, Jiyun Jeong, Mike Parker, John Kim. 1-13
- PoisonIvy: Safe speculation for secure memory. Tamara Silbergleit Lehman, Andrew D. Hilton, Benjamin C. Lee. 1-13
- C3D: Mitigating the NUMA bottleneck via coherent DRAM caches. Cheng-Chieh Huang, Rakesh Kumar, Marco Elver, Boris Grot, Vijay Nagarajan. 1-12
- pTask: A smart prefetching scheme for OS intensive applications. Prathmesh Kallurkar, Smruti R. Sarangi. 1-12
- Register sharing for equality prediction. Arthur Perais, Fernando A. Endo, André Seznec. 1-12
- Delegated persist ordering. Aasheesh Kolli, Jeff Rosen, Stephan Diestelhorst, Ali G. Saidi, Steven Pelley, Sihang Liu, Peter M. Chen, Thomas F. Wenisch. 1-13
- Snatch: Opportunistically reassigning power allocation between processor and memory in 3D stacks. Dimitrios Skarlatos, Renji Thomas, Aditya Agrawal, Shibin Qin, Robert C. N. Pilawa-Podgurski, Ulya R. Karpuzcu, Radu Teodorescu, Nam Sung Kim, Josep Torrellas. 1-12
- Efficient kernel synthesis for performance portable programming. Li-Wen Chang, Izzat El Hajj, Christopher I. Rodrigues, Juan Gómez-Luna, Wen-mei Hwu. 1-13
- Improving bank-level parallelism for irregular applications. Xulong Tang, Mahmut T. Kandemir, Praveen Yedlapalli, Jagadish Kotra. 1-12
- From high-level deep neural models to FPGAs. Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, Hadi Esmaeilzadeh. 1-12
- Jump over ASLR: Attacking branch predictors to bypass ASLR. Dmitry Evtyushkin, Dmitry V. Ponomarev, Nael B. Abu-Ghazaleh. 1-13
- A unified memory network architecture for in-memory computing in commodity servers. Jia Zhan, Itir Akgun, Jishen Zhao, Al Davis, Paolo Faraboschi, Yuangang Wang, Yuan Xie. 1-14
- Stripes: Bit-serial deep neural network computing. Patrick Judd, Jorge Albericio, Tayler H. Hetherington, Tor M. Aamodt, Andreas Moshovos. 1-12
- Redefining QoS and customizing the power management policy to satisfy individual mobile users. Kaige Yan, Xingyao Zhang, Jingweijia Tan, Xin Fu. 1-12
- A patch memory system for image processing and computer vision. Jason Clemons, Chih-Chi Cheng, Iuri Frosio, Daniel R. Johnson, Stephen W. Keckler. 1-13
- SABRes: Atomic object reads for in-memory rack-scale computing. Alexandros Daglis, Dmitrii Ustiugov, Stanko Novakovic, Edouard Bugnion, Babak Falsafi, Boris Grot. 1-13
- The Bunker Cache for spatio-value approximation. Joshua San Miguel, Jorge Albericio, Natalie D. Enright Jerger, Aamer Jaleel. 1-12
- The microarchitecture of a real-time robot motion planning accelerator. Sean Murray, William Floyd-Jones, Ying Qi, George Konidaris, Daniel J. Sorin. 1-12