Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019

Jeffrey K. Hollingsworth, Idit Keidar, editors, Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019. ACM, 2019.

Conference: ppopp2019

Beyond human-level accuracy: computational challenges in deep learning. Joel Hestness, Newsha Ardalani, Gregory F. Diamos. 1-14

S-EnKF: co-designing for scalable ensemble Kalman filter. Junmin Xiao, Shijie Wang, Weiqiang Wan, Xuehai Hong, Guangming Tan. 15-26

Throughput-oriented GPU memory allocation. Isaac Gelado, Michael Garland. 27-37

SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU. Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang 0001. 38-52

Incremental flattening for nested data parallelism. Troels Henriksen, Frederik Thorøe, Martin Elsman, Cosmin E. Oancea. 53-67

Adaptive sparse matrix-matrix multiplication on the GPU. Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger. 68-81

Modular transactions: bounding mixed races in space and time. Brijesh Dongol, Radha Jagadeesan, James Riely. 82-93

Leveraging hardware TM in Haskell. Ryan Yates, Michael L. Scott. 94-106

Stretching the capacity of hardware transactional memory in IBM POWER architectures. Ricardo Filipe, Shady Issa, Paolo Romano 0002, João Pedro Barreto 0002. 107-119

Processing transactions in a predefined order. Mohamed M. Saad, Masoomeh Javidi Kishi, Shihao Jing, Sandeep Hans, Roberto Palmieri. 120-132

Harmonia: a high throughput B+tree for GPUs. Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang. 133-144

Engineering a high-performance GPU B-Tree. Muhammad A. Awad, Saman Ashkiani, Rob Johnson, Martin Farach-Colton, John D. Owens. 145-157

QTLS: high-performance TLS asynchronous offload framework with Intel® QuickAssist technology. Xiaokang Hu, Changzheng Wei, Jian Li 0021, Brian Will, Ping Yu, Lu Gong, Haibing Guan. 158-172

Data-flow/dependence profiling for structured transformations. Fabian Gruber, Manuel Selva, Diogo Sampaio, Christophe Guillon, Antoine Moynault, Louis-Noël Pouchet, Fabrice Rastello. 173-185

Lightweight hardware transactional memory profiling. Qingsen Wang, Pengfei Su, Milind Chabbi, Xu Liu 0001. 186-200

A pattern based algorithmic autotuner for graph processing on GPUs. Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun. 201-213

Provably and practically efficient granularity control. Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, Mike Rainey. 214-228

A coordinated tiling and batching framework for efficient GEMM on GPUs. Xiuhong Li, Yun Liang 0001, Shengen Yan, Liancheng Jia, Yinghan Li. 229-241

Semantics-aware scheduling policies for synchronization determinism. Qi Zhao, Zhengyi Qiu, Guoliang Jin. 242-256

Proactive work stealing for futures. Kyle Singer, Yifan Xu, I-Ting Angelina Lee. 257-271

A round-efficient distributed betweenness centrality algorithm. Loc Hoang, Matteo Pontecorvi, Roshan Dathathri, Gurbinder Gill, Bozhi You, Keshav Pingali, Vijaya Ramachandran. 272-286

Corrected trees for reliable group communication. Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler. 287-299

Adaptive sparse tiling for sparse matrix multiplication. Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh, P. Sadayappan. 300-314

Encapsulated open nesting for STM: fine-grained higher-level conflict detection. Martin Bättig, Thomas R. Gross. 315-326

A specialized B-tree for concurrent datalog evaluation. Herbert Jordan, Pavle Subotic, David Zhao, Bernhard Scholz. 327-339

Efficient race detection with futures. Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, I-Ting Angelina Lee. 340-354

Verifying C11 programs operationally. Simon Doherty, Brijesh Dongol, Heike Wehrheim, John Derrick. 355-365

Checking linearizability using hitting families. Burcu Kulahcioglu Ozkan, Rupak Majumdar, Filip Niksic. 366-377

Transitive joins: a sound and efficient online deadlock-avoidance policy. Caleb Voss, Tiago Cogumbreiro, Vivek Sarkar. 378-390

VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing. Jiawen Sun, Hans Vandierendonck, Dimitrios S. Nikolopoulos. 391-392

GPOP: a cache and memory-efficient framework for graph processing over partitions. Kartik Lakhotia, Rajgopal Kannan, Sourav Pati, Viktor K. Prasanna. 393-394

Optimizing graph processing on GPUs using approximate computing: poster. Somesh Singh, Rupesh Nasre. 395-396

A GPU memory efficient speed-up scheme for training ultra-deep neural networks: poster. Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li. 397-398

Profiling based out-of-core hybrid method for large neural networks: poster. Yuki Ito, Haruki Imai, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo. 399-400

Exploiting the input sparsity to accelerate deep neural networks: poster. Xiao Dong, Lei Liu, Guangli Li, Jiansong Li, Peng Zhao, Xueying Wang, Xiaobing Feng 0002. 401-402

Accelerating distributed stochastic gradient descent with adaptive periodic parameter averaging: poster. Peng Jiang, Gagan Agrawal. 403-404

Optimizing GPU programs by register demotion: poster. Putt Sakdhnagool, Amit Sabne, Rudolf Eigenmann. 405-406

A distributed hypervisor for resource aggregation: poster. Yubin Chen, Zhuocheng Ding, Jin Zhang, Yun Wang, Zhengwei Qi, Haibing Guan. 407-408

Scheduling HPC workloads on heterogeneous-ISA architectures: poster. Mohamed L. Karaoui, Anthony Carno, Robert Lyerly, Sang-Hoon Kim, Pierre Olivier, Changwoo Min, Binoy Ravindran. 409-410

T-thinker: a task-centric distributed framework for compute-intensive divide-and-conquer algorithms. Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, John C. S. Lui, Weida Tan. 411-412

Toward efficient architecture-independent algorithms for dynamic programs: poster. Mohammad Mahdi Javanmard, Pramod Ganapathi, Rathish Das, Zafar Ahmad, Stephen L. Tschudi, Rezaul Chowdhury. 413-414

Optimizing computation-communication overlap in asynchronous task-based programs: poster. Emilio Castillo, Nikhil Jain, Marc Casas, Miquel Moretó, Martin Schulz 0001, Ramón Beivide, Mateo Valero, Abhinav Bhatele. 415-416

Lock-free channels for programming via communicating sequential processes: poster. Nikita Koval, Dan Alistarh, Roman Elizarov. 417-418

Making concurrent algorithms detectable: poster. Naama Ben-David, Guy E. Blelloch, Michal Friedman, Yuanhao Wei. 419-420

GPU-based 3D cryo-EM reconstruction with key-value streams: poster. Kunpeng Wang, Shizhen Xu, Hongkun Yu, Haohuan Fu, Guangwen Yang. 421-422

BASMAT: bottleneck-aware sparse matrix-vector multiplication auto-tuning on GPGPUs. Athena Elafrou, Georgios I. Goumas, Nectarios Koziris. 423-424

LOFT: lock-free transactional data structures. Avner Elizarov, Guy Golan-Gueta, Erez Petrank. 425-426

Automated multi-dimensional elasticity for streaming runtimes: poster. Xiang Ni, Scott Schneider 0001, Raju Pavuluri, Jonathan Kaus, Kun-Lung Wu. 427-428

Compiler-assisted adaptive program scheduling in big.LITTLE systems: poster. Marcelo Novaes, Vinicius Petrucci, Abdoulaye Gamatié, Fernando Magno Quintão Pereira. 429-430

GOPipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU. Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi. 431-432

High-throughput image alignment for connectomics using frugal snap judgments: poster. Tim Kaler, Brian Wheatman, Sarah Wooders. 433-434

CuLDA_CGS: solving large-scale LDA problems on GPUs. Xiaolong Xie, Yun Liang 0001, Xiuhong Li, Wei Tan. 435-436

Managing application parallelism via parallel efficiency regulation: poster. Sharanyan Srikanthan, Princeton Ferro, Sayak Chakraborti, Sandhya Dwarkadas. 437-438

Blockchain abstract data type: poster. Emmanuelle Anceaume, Antonella Del Pozzo, Romaric Ludinard, Maria Potop-Butucaru, Sara Tucci Piergiovanni. 439-440

Creating repeatable, reusable experimentation pipelines with popper: tutorial. Ivo Jimenez, Jay F. Lofstead, Carlos Maltzahn. 441-442

Building parallel programming language constructs in the AbleC extensible C compiler framework: a PPoPP tutorial. Travis Carlson, Eric Van Wyk. 443-446

Implementing parallel and concurrent tree structures. Yihan Sun 0001, Guy E. Blelloch. 447-450

Programming quantum computers: a primer with IBM Q and D-Wave exercises. Frank Mueller, Greg Byrd, Patrick Dreher. 451

High performance distributed deep learning: a beginner's guide. Dhabaleswar K. Panda, Ammar Ahmad Awan, Hari Subramoni. 452-454

Performance portable C++ programming with RAJA. David Beckingsale, Richard D. Hornung, Tom Scogland, Arturo Vargas. 455-456