- More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms. Vincent Gramoli. 1-10
- The SprayList: a scalable relaxed priority queue. Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavit. 11-20
- Predicate RCU: an RCU for scalable concurrent updates. Maya Arbel, Adam Morrison. 21-30
- Automatic scalable atomicity via semantic locking. Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav. 31-41
- A framework for practical parallel fast matrix multiplication. Austin R. Benson, Grey Ballard. 42-53
- PLUTO+: near-complete modeling of affine transformations for parallelism and locality. Aravind Acharya, Uday Bondhugula. 54-64
- Distributed memory code generation for mixed Irregular/Regular computations. Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan. 65-75
- Software partitioning of hardware transactions. Lingxiang Xiang, Michael L. Scott. 76-86
- Performance implications of dynamic memory allocators on transactional memory systems. Alexandro Baldassin, Edson Borin, Guido Araujo. 87-96
- Low-overhead software transactional memory with progress guarantees and strong semantics. Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond. 97-108
- Barrier elision for production parallel programs. Milind Chabbi, Wim Lavrijsen, Wibe De Jong, Koushik Sen, John M. Mellor-Crummey, Costin Iancu. 109-119
- Scalable and efficient implementation of 3d unstructured meshes computation: a case study on matrix assembly. Loïc Thébault, Eric Petit, Quang Dinh. 120-129
- Diagnosing the causes and severity of one-sided message contention. Nathan R. Tallent, Abhinav Vishnu, Hubertus van Dam, Jeff Daily, Darren J. Kerbyson, Adolfy Hoisie. 130-139
- A parallel algorithm for global states enumeration in concurrent systems. Yen-Jung Chang, Vijay K. Garg. 140-149
- Dynamic deadlock verification for general barrier synchronisation. Tiago Cogumbreiro, Raymond Hu, Francisco Martins, Nobuko Yoshida. 150-160
- VirtCL: a framework for OpenCL device abstraction and management. Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao. 161-172
- On optimizing machine learning workloads via kernel fusion. Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan. 173-182
- NUMA-aware graph-structured analytics. Kaiyuan Zhang, Rong Chen, Haibo Chen. 183-193
- SYNC or ASYNC: time to fuse for distributed graph-parallel computation. Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, Haibo Chen. 194-204
- Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, Rezaul A. Chowdhury. 205-214
- High performance locks for multi-level NUMA systems. Milind Chabbi, Michael W. Fagan, John M. Mellor-Crummey. 215-226
- A library for portable and composable data locality optimizations for NUMA systems. Zoltan Majo, Thomas R. Gross. 227-238
- MPI+Threads: runtime contention and remedies. Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, Satoshi Matsuoka. 239-248
- Fence placement for legacy data-race-free programs via synchronization read detection. Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra. 249-250
- JAWS: a JavaScript framework for adaptive CPU-GPU work sharing. Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, Jae W. Lee. 251-252
- GStream: a graph streaming processing method for large-scale graphs on GPUs. Hyunseok Seo, Jinwook Kim, Min-Soo Kim. 253-254
- SemCache++: semantics-aware caching for efficient multi-GPU offloading. Nabeel AlSaber, Milind Kulkarni. 255-256
- An OpenACC-based unified programming model for multi-accelerator systems. Jungwon Kim, Seyong Lee, Jeffrey S. Vetter. 257-258
- The lazy happens-before relation: better partial-order reduction for systematic concurrency testing. Paul Thomson, Alastair F. Donaldson. 259-260
- Towards batched linear solvers on accelerated hardware platforms. Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra. 261-262
- A collection-oriented programming model for performance portability. Saurav Muralidharan, Michael Garland, Bryan C. Catanzaro, Albert Sidelnik, Mary W. Hall. 263-264
- Gunrock: a high-performance graph processing library on the GPU. Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens. 265-266
- Decoupled load balancing. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato. 267-268
- Combining phase identification and statistic modeling for automated parallel benchmark generation. Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy S. Logan, Norbert Podhorszki, Jong Youl Choi, Scott Klasky. 269-270
- Optimization of asynchronous graph processing on GPU with hybrid coloring model. Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, Jianlong Zhong. 271-272
- Efficient and reasonable object-oriented concurrency. Scott West, Sebastian Nanz, Bertrand Meyer. 273-274
- A programming model and runtime system for significance-aware energy-efficient computing. Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, Dimitrios S. Nikolopoulos. 275-276
- The lock-free k-LSM relaxed priority queue. Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, Philippas Tsigas. 277-278
- Static/Dynamic validation of MPI collective communications in multi-threaded context. Emmanuelle Saillard, Patrick Carribault, Denis Barthou. 279-280
- CASTLE: fast concurrent internal binary search tree using edge-based locking. Arunmoezhi Ramachandran, Neeraj Mittal. 281-282
- Section based program analysis to reduce overhead of detecting unsynchronized thread communication. Madan Das, Gabriel Southern, Jose Renau. 283-284
- A hierarchical approach to reducing communication in parallel graph algorithms. Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger. 285-286
- Tiles: a new language mechanism for heterogeneous parallelism. Yifeng Chen, Xiang Cui, Hong Mei. 287-288
- Are web applications ready for parallelism? Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, Danny Dig. 289-290