Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, February 7-11, 2015

Albert Cohen, David Grove, editors, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, February 7-11, 2015. ACM, 2015.

Conference: ppopp2015

More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms. Vincent Gramoli. 1-10

The SprayList: a scalable relaxed priority queue. Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavit. 11-20

Predicate RCU: an RCU for scalable concurrent updates. Maya Arbel, Adam Morrison. 21-30

Automatic scalable atomicity via semantic locking. Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav. 31-41

A framework for practical parallel fast matrix multiplication. Austin R. Benson, Grey Ballard. 42-53

PLUTO+: near-complete modeling of affine transformations for parallelism and locality. Aravind Acharya, Uday Bondhugula. 54-64

Distributed memory code generation for mixed Irregular/Regular computations. Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan. 65-75

Software partitioning of hardware transactions. Lingxiang Xiang, Michael L. Scott. 76-86

Performance implications of dynamic memory allocators on transactional memory systems. Alexandro Baldassin, Edson Borin, Guido Araujo. 87-96

Low-overhead software transactional memory with progress guarantees and strong semantics. Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond. 97-108

Barrier elision for production parallel programs. Milind Chabbi, Wim Lavrijsen, Wibe De Jong, Koushik Sen, John M. Mellor-Crummey, Costin Iancu. 109-119

Scalable and efficient implementation of 3D unstructured meshes computation: a case study on matrix assembly. Loïc Thébault, Eric Petit, Quang Dinh. 120-129

Diagnosing the causes and severity of one-sided message contention. Nathan R. Tallent, Abhinav Vishnu, Hubertus van Dam, Jeff Daily, Darren J. Kerbyson, Adolfy Hoisie. 130-139

A parallel algorithm for global states enumeration in concurrent systems. Yen-Jung Chang, Vijay K. Garg. 140-149

Dynamic deadlock verification for general barrier synchronisation. Tiago Cogumbreiro, Raymond Hu, Francisco Martins, Nobuko Yoshida. 150-160

VirtCL: a framework for OpenCL device abstraction and management. Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao. 161-172

On optimizing machine learning workloads via kernel fusion. Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan. 173-182

NUMA-aware graph-structured analytics. Kaiyuan Zhang, Rong Chen, Haibo Chen. 183-193

SYNC or ASYNC: time to fuse for distributed graph-parallel computation. Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, Haibo Chen. 194-204

Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, Rezaul A. Chowdhury. 205-214

High performance locks for multi-level NUMA systems. Milind Chabbi, Michael W. Fagan, John M. Mellor-Crummey. 215-226

A library for portable and composable data locality optimizations for NUMA systems. Zoltan Majo, Thomas R. Gross. 227-238

MPI+Threads: runtime contention and remedies. Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, Satoshi Matsuoka. 239-248

Fence placement for legacy data-race-free programs via synchronization read detection. Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra. 249-250

JAWS: a JavaScript framework for adaptive CPU-GPU work sharing. Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, Jae W. Lee. 251-252

GStream: a graph streaming processing method for large-scale graphs on GPUs. Hyunseok Seo, Jinwook Kim, Min-Soo Kim. 253-254

SemCache++: semantics-aware caching for efficient multi-GPU offloading. Nabeel AlSaber, Milind Kulkarni. 255-256

An OpenACC-based unified programming model for multi-accelerator systems. Jungwon Kim, Seyong Lee, Jeffrey S. Vetter. 257-258

The lazy happens-before relation: better partial-order reduction for systematic concurrency testing. Paul Thomson, Alastair F. Donaldson. 259-260

Towards batched linear solvers on accelerated hardware platforms. Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra. 261-262

A collection-oriented programming model for performance portability. Saurav Muralidharan, Michael Garland, Bryan C. Catanzaro, Albert Sidelnik, Mary W. Hall. 263-264

Gunrock: a high-performance graph processing library on the GPU. Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens. 265-266

Decoupled load balancing. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato. 267-268

Combining phase identification and statistic modeling for automated parallel benchmark generation. Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy S. Logan, Norbert Podhorszki, Jong Youl Choi, Scott Klasky. 269-270

Optimization of asynchronous graph processing on GPU with hybrid coloring model. Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, Jianlong Zhong. 271-272

Efficient and reasonable object-oriented concurrency. Scott West, Sebastian Nanz, Bertrand Meyer. 273-274

A programming model and runtime system for significance-aware energy-efficient computing. Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, Dimitrios S. Nikolopoulos. 275-276

The lock-free k-LSM relaxed priority queue. Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, Philippas Tsigas. 277-278

Static/Dynamic validation of MPI collective communications in multi-threaded context. Emmanuelle Saillard, Patrick Carribault, Denis Barthou. 279-280

CASTLE: fast concurrent internal binary search tree using edge-based locking. Arunmoezhi Ramachandran, Neeraj Mittal. 281-282

Section based program analysis to reduce overhead of detecting unsynchronized thread communication. Madan Das, Gabriel Southern, Jose Renau. 283-284

A hierarchical approach to reducing communication in parallel graph algorithms. Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger. 285-286

Tiles: a new language mechanism for heterogeneous parallelism. Yifeng Chen, Xiang Cui, Hong Mei. 287-288

Are web applications ready for parallelism? Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, Danny Dig. 289-290