- Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing. Jorge Albericio, Patrick Judd, Tayler H. Hetherington, Tor M. Aamodt, Natalie D. Enright Jerger, Andreas Moshovos. pp. 1-13
- ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars. Ali Shafiee, Anirban Nag, Naveen Muralimanohar, Rajeev Balasubramonian, John Paul Strachan, Miao Hu, R. Stanley Williams, Vivek Srikumar. pp. 14-26
- PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory. Ping Chi, Shuangchen Li, Cong Xu, Tao Zhang, Jishen Zhao, Yongpan Liu, Yu Wang, Yuan Xie. pp. 27-39
- Asymmetry-Aware Work-Stealing Runtimes. Christopher Torng, Moyang Wang, Christopher Batten. pp. 40-52
- Morpheus: Creating Application Objects Efficiently for Heterogeneous Computing. Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhou, Mark Gahagan, Steven Swanson. pp. 53-65
- Towards Statistical Guarantees in Controlling Quality Tradeoffs for Approximate Acceleration. Divya Mahajan, Amir Yazdanbakhsh, Jongse Park, Bradley Thwaites, Hadi Esmaeilzadeh. pp. 66-77
- Back to the Future: Leveraging Belady's Algorithm for Improved Cache Replacement. Akanksha Jain, Calvin Lin. pp. 78-89
- Efficient Synonym Filtering and Scalable Delayed Translation for Hybrid Virtual Caching. Chang-Hyun Park, Taekyung Heo, Jaehyuk Huh. pp. 90-102
- LAP: Loop-Block Aware Inclusion Properties for Energy-Efficient Asymmetric Last Level Caches. Hsiang-Yun Cheng, Jishen Zhao, Jack Sampson, Mary Jane Irwin, Aamer Jaleel, Yu Lu, Yuan Xie. pp. 103-114
- Automatic Generation of Efficient Accelerators for Reconfigurable Hardware. David Koeplinger, Raghu Prabhakar, Yaqi Zhang, Christina Delimitrou, Christos Kozyrakis, Kunle Olukotun. pp. 115-127
- Strober: Fast and Accurate Sample-Based Energy Simulation for Arbitrary RTL. Donggyu Kim, Adam M. Izraelevitz, Christopher Celio, Hokeun Kim, Brian Zimmer, Yunsup Lee, Jonathan Bachrach, Krste Asanovic. pp. 128-139
- PowerChop: Identifying and Managing Non-critical Units in Hybrid Processor Architectures. Michael A. Laurenzano, Yunqi Zhang, Jiang Chen, Lingjia Tang, Jason Mars. pp. 140-152
- Biscuit: A Framework for Near-Data Processing of Big Data Workloads. Boncheol Gu, Andre S. Yoon, Duck-Ho Bae, Insoon Jo, Jinyoung Lee, Jonghyun Yoon, Jeong-Uk Kang, MoonSang Kwon, Chanho Yoon, Sangyeun Cho, Jaeheon Jeong, Duckhyun Chang. pp. 153-165
- Energy Efficient Architecture for Graph Analytics Accelerators. Muhammet Mustafa Ozdal, Serif Yesil, Taemin Kim, Andrey Ayupov, John Greth, Steven M. Burns, Özcan Özturk. pp. 166-177
- ASIC Clouds: Specializing the Datacenter. Ikuo Magaki, Moein Khazraee, Luis Vega Gutierrez, Michael Bedford Taylor. pp. 178-190
- APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs. Yunho Oh, Keunsoo Kim, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, Won Woo Ro, Murali Annavaram. pp. 191-203
- Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems. Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, Niladrish Chatterjee, Mike O'Connor, Nandita Vijaykumar, Onur Mutlu, Stephen W. Keckler. pp. 204-216
- Efficient Intra-SM Slicing through Dynamic Resource Partitioning for GPU Multiprogramming. Chang-Hyun Park, Taekyung Heo, Jaehyuk Huh. pp. 217-229
- Warped-Slicer: Efficient Intra-SM Slicing through Dynamic Resource Partitioning for GPU Multiprogramming. Qiumin Xu, Hyeran Jeon, Keunsoo Kim, Won Woo Ro, Murali Annavaram. pp. 230-242
- EIE: Efficient Inference Engine on Compressed Deep Neural Network. Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally. pp. 243-254
- RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision. Robert LiKamWa, Yunhui Hou, Yuan Gao, Mia Polansky, Lin Zhong. pp. 255-266
- Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators. Brandon Reagen, Paul N. Whatmough, Robert Adolf, Saketh Rama, Hyunkwang Lee, Sae Kyu Lee, José Miguel Hernández-Lobato, Gu-Yeon Wei, David M. Brooks. pp. 267-278
- Opportunistic Competition Overhead Reduction for Expediting Critical Section in NoC Based CMPs. Yuan Yao, Zhonghai Lu. pp. 279-290
- Short-Circuit Dispatch: Accelerating Virtual Machine Interpreters on Embedded Processors. Channoh Kim, Sungmin Kim, Hyeon-Gyu Cho, Doo-young Kim, Jaehyeok Kim, Young H. Oh, Hakbeom Jang, Jae W. Lee. pp. 291-303
- ARM Virtualization: Performance and Architectural Implications. Christoffer Dall, Shih-wei Li, Jin Tack Lim, Jason Nieh, Georgios Koloventzos. pp. 304-316
- Base-Victim Compression: An Opportunistic Cache Compression Architecture. Jayesh Gaur, Alaa R. Alameldeen, Sreenivas Subramoney. pp. 317-328
- Bit-Plane Compression: Transforming Data for Better Compression in Many-Core Architectures. Jungrae Kim, Michael Sullivan, Esha Choukse, Mattan Erez. pp. 329-340
- XED: Exposing On-Die Error Detection Information for Strong Memory Reliability. Prashant J. Nair, Vilas Sridharan, Moinuddin K. Qureshi. pp. 341-353
- Production-Run Software Failure Diagnosis via Adaptive Communication Tracking. Mohammad Mejbah Ul Alam, Abdullah Muzahid. pp. 354-366
- Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks. Yu-Hsin Chen, Joel S. Emer, Vivienne Sze. pp. 367-379
- Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory. Duckhwan Kim, Jaeha Kung, Sek M. Chai, Sudhakar Yalamanchili, Saibal Mukhopadhyay. pp. 380-392
- Cambricon: An Instruction Set Architecture for Neural Networks. Shaoli Liu, Zidong Du, Jinhua Tao, Dong Han, Tao Luo, Yuan Xie, Yunji Chen, Tianshi Chen. pp. 393-405
- Decoupling Loads for Nano-Instruction Set Computers. Ziqiang Huang, Andrew D. Hilton, Benjamin C. Lee. pp. 406-417
- Future Vector Microprocessor Extensions for Data Aggregations. Timothy Hayes, Oscar Palomar, Osman S. Unsal, Adrián Cristal, Mateo Valero. pp. 418-430
- Efficiently Scaling Out-of-Order Cores for Simultaneous Multithreading. Faissal M. Sleiman, Thomas F. Wenisch. pp. 431-443
- Accelerating Dependent Cache Misses with an Enhanced Memory Controller. Milad Hashemi, Khubaib, Eiman Ebrahimi, Onur Mutlu, Yale N. Patt. pp. 444-455
- Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference. Yunqi Zhang, David Meisner, Jason Mars, Lingjia Tang. pp. 456-468
- Dynamo: Facebook's Data Center-Wide Power Management System. Qiang Wu, Qingyuan Deng, Lakshmi Ganesh, Chang-Hong Hsu, Yun Jin, Sanjeev Kumar, Bin Li, Justin Meza, Yee Jiun Song. pp. 469-480
- Peak Efficiency Aware Scheduling for Highly Energy Proportional Servers. Daniel Wong. pp. 481-492
- Power Attack Defense: Securing Battery-Backed Data Centers. Chao Li, Zhenhua Wang, Xiaofeng Hou, Haopeng Chen, Xiaoyao Liang, Minyi Guo. pp. 493-505
- DRAF: A Low-Power DRAM-Based Reconfigurable Acceleration Fabric. Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna T. Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis. pp. 506-518
- Mellow Writes: Extending Lifetime in Resistive Memories through Selective Slow Write Backs. Lunkai Zhang, Brian Neely, Diana Franklin, Dmitri B. Strukov, Yuan Xie, Frederic T. Chong. pp. 519-531
- MITTS: Memory Inter-arrival Time Traffic Shaping. Yanqi Zhou, David Wentzlaff. pp. 532-544
- The Anytime Automaton. Joshua San Miguel, Natalie D. Enright Jerger. pp. 545-557
- Accelerating Markov Random Field Inference Using Molecular Optical Gibbs Sampling Units. Siyang Wang, Xiangyu Zhang, Yuxuan Li, Ramin Bashizade, Song Yang, Chris Dwyer, Alvin R. Lebeck. pp. 558-569
- Evaluation of an Analog Accelerator for Linear Algebra. Yipeng Huang, Ning Guo, Mingoo Seok, Yannis P. Tsividis, Simha Sethumadhavan. pp. 570-582
- LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs. Jin Wang, Norm Rubin, Albert Sidelnik, Sudhakar Yalamanchili. pp. 583-595
- ActivePointers: A Case for Software Address Translation on GPUs. Sagi Shahar, Shai Bergman, Mark Silberstein. pp. 596-608
- Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit. Myung Kuk Yoon, Keunsoo Kim, Sangpil Lee, Won Woo Ro, Murali Annavaram. pp. 609-621
- All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory. Jungrae Kim, Michael Sullivan, Sangkug Lym, Mattan Erez. pp. 622-633
- Rescuing Uncorrectable Fault Patterns in On-Chip Memories through Error Pattern Transformation. Henry Duwe, Xun Jian, Daniel Petrisko, Rakesh Kumar. pp. 634-644
- RelaxFault Memory Repair. Dong-Wan Kim, Mattan Erez. pp. 645-657
- Using Multiple Input, Multiple Output Formal Control to Maximize Resource Efficiency in Architectures. Raghavendra Pradyumna Pothukuchi, Amin Ansari, Petros G. Voulgaris, Josep Torrellas. pp. 658-670
- Exploiting Dynamic Timing Slack for Energy Efficiency in Ultra-Low-Power Embedded Systems. Hari Cherupalli, Rakesh Kumar, John Sartori. pp. 671-681
- CASH: Supporting IaaS Customers with a Sub-core Configurable Architecture. Yanqi Zhou, Henry Hoffmann, David Wentzlaff. pp. 682-694
- Boosting Access Parallelism to PCM-Based Main Memory. Mohammad Arjomand, Mahmut T. Kandemir, Anand Sivasubramaniam, Chita R. Das. pp. 695-706
- Agile Paging: Exceeding the Best of Nested and Shadow Paging. Jayneel Gandhi, Mark D. Hill, Michael M. Swift. pp. 707-718
- Energy Efficient Data Encoding in DRAM Channels Exploiting Data Value Similarity. Hoseok Seol, Wongyu Shin, Jaemin Jang, Jungwhan Choi, Jinwoong Suh, Lee-Sup Kim. pp. 719-730