- MASR: A Modular Accelerator for Sparse RNNs. Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu-Yeon Wei, David Brooks 0001. 1-14
- Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics. Roshan Dathathri, Gurbinder Gill, Loc Hoang, Vishwesh Jatala, Keshav Pingali, V. Krishna Nandivada, Hoang-Vu Dang, Marc Snir. 15-28
- BOLT: Optimizing OpenMP Parallel Regions with User-Level Threads. Shintaro Iwasaki, Abdelhalim Amer, Kenjiro Taura, Sangmin Seo, Pavan Balaji. 29-42
- SMT-COP: Defeating Side-Channel Attacks on Execution Units in SMT Processors. Daniel Townley, Dmitry Ponomarev. 43-54
- Type-Directed Program Synthesis and Constraint Generation for Library Portability. Bruce Collie, Philip Ginsbach, Michael F. P. O'Boyle. 55-67
- Deepframe: A Profile-Driven Compiler for Spatial Hardware Accelerators. Apala Guha, Naveen Vedula, Arrvindh Shriraman. 68-81
- Fast Parallel Equivalence Relations in a Datalog Compiler. Patrick Nappa, David Zhao, Pavle Subotic, Bernhard Scholz. 82-96
- Enforcing Last-Level Cache Partitioning through Memory Virtual Channels. Jongwook Chung, Yuhwan Ro, Joonsung Kim, Jaehyung Ahn, Jangwoo Kim, John Kim, Jae W. Lee, Jung Ho Ahn. 97-109
- To Stack or Not To Stack. Richard Afoakwa, Lejie Lu, Hui Wu, Michael Huang. 110-123
- Enforcing Crash Consistency of Evolving Network Analytics in Non-Volatile Main Memory Systems. Soklong Lim, Zaixin Lu, Bin Ren, Xuechen Zhang. 124-137
- Fooling the Sense of Cross-Core Last-Level Cache Eviction Based Attacker by Prefetching Common Sense. Biswabandan Panda. 138-150
- SpecShield: Shielding Speculative Data from Microarchitectural Covert Channels. Kristin Barber, Anys Bacha, Li Zhou, Yinqian Zhang, Radu Teodorescu. 151-164
- MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference. Myeonggyun Han, Jihoon Hyun, Seongbeom Park, Jinsu Park, Woongki Baek. 165-177
- Acorns: A Framework for Accelerating Deep Neural Networks with Input Sparsity. Xiao Dong, Lei Liu, Peng Zhao, Guangli Li, Jiansong Li, Xueying Wang, Xiaobing Feng 0002. 178-191
- Forgive-TM: Supporting Lazy Conflict Detection In Eager Hardware Transactional Memory. Sunjae Park, Christopher J. Hughes, Milos Prvulovic. 192-204
- Unfair Scheduling Patterns in NUMA Architectures. Naama Ben-David, Ziv Scully, Guy E. Blelloch. 205-218
- Optimizing Persistent Memory Transactions. Pantea Zardoshti, Tingzhe Zhou, Yujie Liu, Michael F. Spear. 219-231
- HeTM: Transactional Memory for Heterogeneous Systems. Daniel Castro 0004, Paolo Romano 0002, Aleksandar Illic, Amin M. Khan. 232-244
- Achieving Scalability in a k-NN Multi-GPU Network Service with Centaur. Amir Watad, Alexander Libov, Ohad Shacham, Edward Bortnikov, Mark Silberstein. 245-257
- Analyzing and Leveraging Remote-Core Bandwidth for Enhanced Performance in GPUs. Mohamed Assem Ibrahim, Hongyuan Liu, Onur Kayiran, Adwait Jog. 258-271
- Specialization Opportunities in Graphical Workloads. Lewis Crawford, Michael F. P. O'Boyle. 272-283
- FindeR: Accelerating FM-Index-Based Exact Pattern Matching in Genomic Sequences through ReRAM Technology. Farzaneh Zokaee, Mingzhe Zhang, Lei Jiang 0001. 284-295
- SLAMBooster: An Application-Aware Online Controller for Approximation in Dense SLAM. Yan Pei, Swarnendu Biswas, Donald S. Fussell, Keshav Pingali. 296-310
- Exploring Memory Persistency Models for GPUs. Zhen Lin, Mohammad A. Alshboul, Yan Solihin, Huiyang Zhou. 311-323
- Adaptive Task Aggregation for High-Performance Sparse Solvers on GPUs. Ahmed E. Helal, Ashwin M. Aji, Michael L. Chu, Bradford M. Beckmann, Wu-chun Feng. 324-336
- EDGE: Event-Driven GPU Execution. Tayler Hicklin Hetherington, Maria Lubeznov, Deval Shah, Tor M. Aamodt. 337-353
- Generating Portable High-Performance Code via Multi-Dimensional Homomorphisms. Ari Rasch, Richard Schulze, Sergei Gorlatch. 354-369
- Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot. Tobias Gysi, Tobias Grosser, Torsten Hoefler. 370-382
- Reducing Data Movement and Energy in Multilevel Cache Hierarchies without Losing Performance: Can you have it all? Jiajun Wang, Prakash Ramrakhyani, Wendy Elsasser, Lizy Kurian John. 383-394
- Multiversioned Page Overlays: Enabling Faster Serializable Hardware Transactional Memory. Ziqi Wang, Michael A. Kozuch, Todd C. Mowry, Vivek Seshadri. 395-408
- Computing Three-Dimensional Constrained Delaunay Refinement Using the GPU. Zhenghai Chen, Tiow Seng Tan. 409-420
- A Synchronization-Avoiding Distance-1 Grundy Coloring Algorithm for Power-Law Graphs. Jesun Sahariar Firoz, Marcin Zalewski, Andrew Lumsdaine. 421-432
- Accelerating DCA++ (Dynamical Cluster Approximation) Scientific Application on the Summit Supercomputer. Giovanni Balduzzi, Arghya Chatterjee, Ying Wai Li, Peter W. Doak, Urs R. Hähner, Eduardo F. D'Azevedo, Thomas A. Maier, Thomas C. Schulthess. 433-444
- A Methodology for Characterizing Sparse Datasets and Its Application to SIMD Performance Prediction. Gangyi Zhu, Peng Jiang, Gagan Agrawal. 445-456
- POSTER: Precise Capacity Planning for Database Public Clouds. Ningxin Zheng, Quan Chen, Yong Yang, Jin Li, Wenli Zheng, Minyi Guo. 457-458
- POSTER: BioSEAL: In-Memory Biological Sequence Alignment Accelerator for Large-Scale Genomic Data. Roman Kaplan, Leonid Yavits, Ran Ginosar. 459-460
- POSTER: The Performance Impact of Thread Packing on Synchronization-Intensive Applications. Jinsu Park, Seongbeom Park, Myeonggyun Han, Woongki Baek. 461-462
- POSTER: Leveraging Run-Time Feedback for Efficient ASR Acceleration. Reza Yazdani, Jose-Maria Arnau, Antonio González 0001. 463-464
- POSTER: Automatic Parallelization Targeting Asynchronous Task-Based Runtimes. Charles Jin, Muthu Baskaran, Benoît Meister. 465-466
- POSTER: Memory Hotspot Optimization for Data-Intensive Applications. Xi Wang 0009, Jie Li, Antonino Tumeo, John D. Leidel, Yong Chen 0001. 467-468
- POSTER: GPU Based Near Data Processing for Image Processing with Pattern Aware Data Allocation and Prefetching. Jungwoo Choi, Boyeal Kim, Ji-Ye Jeon, Hyuk-Jae Lee, Euicheol Lim, Chae-Eun Rhee. 469-470
- POSTER: Variable Sized Cache-Block Compaction. Sayantan Ray, Madhu Mutyam. 471-472
- POSTER: Domain-Specialized Cache Management for Graph Analytics. Priyank Faldu, Jeff Diamond, Boris Grot. 473-474
- POSTER: Runtime Adaptations for Energy-Efficient VSLAM. Abdullah Khalufa, Graham D. Riley, Mikel Luján. 475-476
- POSTER: GIRAF: General Purpose In-Storage Resistive Associative Framework. Leonid Yavits, Roman Kaplan, Ran Ginosar. 477-478
- POSTER: An Optimized Predication Execution for SIMD Extensions. Adrián Barredo, Juan M. Cebrian, Miquel Moretó, Marc Casas, Mateo Valero. 479-480
- POSTER: Tango: An Optimizing Compiler for Just-In-Time RTL Simulation. Blaise-Pascal Tine, Sudhakar Yalamanchili, Hyesoon Kim, Jeffrey S. Vetter. 481-482
- POSTER: SPiDRE: Accelerating Sparse Memory Access Patterns. Adrián Barredo, Jonathan C. Beard, Miquel Moretó. 483-484
- POSTER: CogR: Exploiting Program Structures for Machine-Learning Based Runtime Solutions. Hyojin Sung, Tong Chen, Alexandre E. Eichenberger, Kevin K. O'Brien. 485-486
- POSTER: A Collaborative Multi-Factor Scheduler for Asymmetric Multicore Processors. Teng Yu, Pavlos Petoumenos, Vladimir Janjic, Mingcan Zhu, Hugh Leather, John Thomson. 487-488
- POSTER: Space and Time Optimal DNN Primitive Selection with Integer Linear Programming. Yuan Wen, Andrew Anderson 0001, Valentin Radu, Michael F. P. O'Boyle, David Gregg. 489-490
- POSTER: Quiescent and Versioned Shadow Copies for NVM. Zhenwei Wu, Kai Lu, Wenzhe Zhang, Andrew Nisbet, Mikel Luján. 491-492
- POSTER: AR-MMAP: Write Performance Improvement of Memory-Mapped File. Satoshi Imamura, Eiji Yoshida. 493-494
- POSTER: Exploiting Multi-Level Task Dependencies to Prune Redundant Work in Relax-Ordered Task-Parallel Algorithms. Masab Ahmad, Mohsin Shan, Akif Rehman, Omer Khan. 495-496
- POSTER: Quantifying the Direct Overhead of Virtual Function Calls on Massively Parallel Architectures. Mengchi Zhang, Roland N. Green, Timothy G. Rogers. 497-498
- POSTER: A Polyhedral+Dataflow Intermediate Language for Performance Exploration. Eddie C. Davis, Catherine Olschanowsky. 499-500
- POSTER: Pairing Up CNNs for High Throughput Deep Learning. Babak Zamirai, Salar Latifi, Scott A. Mahlke. 501-502
- POSTER: A Memory-Access-Efficient Adaptive Implementation of kNN on FPGA through HLS. Xiaojia Song, Tao Xie 0004, Stephen Fischer. 503-504