- Coding the Continuum. Ian T. Foster. 1
- LACC: A Linear-Algebraic Algorithm for Finding Connected Components in Distributed Memory. Ariful Azad, Aydin Buluç. 2-12
- Shared-Memory Exact Minimum Cuts. Monika Henzinger, Alexander Noe, Christian Schulz. 13-22
- Distributed Weighted All Pairs Shortest Paths Through Pipelining. Udit Agarwal, Vijaya Ramachandran. 23-32
- Local Distributed Algorithms in Highly Dynamic Networks. Philipp Bamberger, Fabian Kuhn, Yannic Maus. 33-42
- Effects and Benefits of Node Sharing Strategies in HPC Batch Systems. Alvaro Frank, Tim Süß, André Brinkmann. 43-53
- Design Space Exploration of Next-Generation HPC Machines. Constantino Gómez, Francesc Martínez, Adrià Armejach, Miquel Moretó, Filippo Mantovani, Marc Casas. 54-65
- A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning. Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler. 66-77
- Double-Precision FPUs in High-Performance Computing: An Embarrassment of Riches? Jens Domke, Kazuaki Matsumura, Mohamed Wahib, Haoyu Zhang, Keita Yashima, Toshiki Tsuchikawa, Yohei Tsuji, Artur Podobas, Satoshi Matsuoka. 78-88
- Communication-Avoiding Cholesky-QR2 for Rectangular Matrices. Edward Hutter, Edgar Solomonik. 89-100
- Asynchronous Multigrid Methods. Jordi Wolfson-Pou, Edmond Chow. 101-110
- Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra. 111-122
- Load-Balanced Sparse MTTKRP on GPUs. Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard W. Vuduc, P. Sadayappan. 123-133
- Practically Efficient Scheduler for Minimizing Average Flow Time of Parallel Jobs. Kunal Agrawal, I-Ting Angelina Lee, Jing Li, Kefu Lu, Benjamin Moseley. 134-144
- Scheduling on (Un-)Related Machines with Setup Times. Klaus Jansen, Marten Maack, Alexander Mäcker. 145-154
- A Scalable Clustering-Based Task Scheduler for Homogeneous Processors Using DAG Partitioning. M. Yusuf Özkaya, Anne Benoit, Bora Uçar, Julien Herrmann, Ümit V. Çatalyürek. 155-165
- Reservation Strategies for Stochastic Jobs. Guillaume Aupy, Ana Gainaru, Valentin Honoré, Padma Raghavan, Yves Robert, Hongyang Sun. 166-175
- Exploiting Flow Graph of System of ODEs to Accelerate the Simulation of Biologically-Detailed Neural Networks. Bruno R. C. Magalhães, Thomas Sterling, Felix Schürmann, Michael L. Hines. 176-187
- Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training. Jiawen Liu, Dong Li, Gokcen Kestor, Jeffrey S. Vetter. 188-199
- Dynamic Memory Management for GPU-Based Training of Deep Neural Networks. Shriram S. B, Anshuj Garg, Purushottam Kulkarni. 200-209
- Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen. 210-220
- Excavating the Potential of GPU for Accelerating Graph Traversal. Pengyu Wang, Lu Zhang, Chao Li, Minyi Guo. 221-230
- ParILUT - A Parallel Threshold ILU for GPUs. Hartwig Anzt, Tobias Ribizel, Goran Flegar, Edmond Chow, Jack J. Dongarra. 231-241
- C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks. Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, Dhabaleswar K. Panda. 242-251
- Slate: Enabling Workload-Aware Efficient Multiprocessing for Modern GPGPUs. Tyler Allen, Xizhou Feng, Rong Ge. 252-261
- A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing. Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, Kevin A. Kwiat, Charles A. Kamhoua. 262-272
- Architecting Racetrack Memory Preshift through Pattern-Based Prediction Mechanisms. Adrian Colaso, Pablo Prieto, Pablo Abad Fidalgo, José-Ángel Gregorio, Valentin Puente. 273-282
- DLHub: Model and Data Serving for Science. Ryan Chard, Zhuozhao Li, Kyle Chard, Logan T. Ward, Yadu N. Babuji, Anna Woodard, Steven Tuecke, Ben Blaiszik, Michael J. Franklin, Ian T. Foster. 283-292
- Identifying Latent Reduced Models to Precondition Lossy Compression. Huizhang Luo, Dan Huang, Qing Liu, Zhenbo Qiao, Hong Jiang, Jing Bi, Haitao Yuan, MengChu Zhou, Jinzhen Wang, Zhenlu Qin. 293-302
- QoS-Driven Coordinated Management of Resources to Save Energy in Multi-core Systems. Mehrzad Nejat, Miquel Pericàs, Per Stenström. 303-313
- Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. Md. Vasimuddin, Sanchit Misra, Heng Li, Srinivas Aluru. 314-324
- Power and Performance Tradeoffs for Visualization Algorithms. Stephanie Labasan, Matthew Larsen, Hank Childs, Barry Rountree. 325-334
- Northup: Divide-and-Conquer Programming in Systems with Heterogeneous Memories and Processors. Shuai Che, Jieming Yin. 335-344
- Distributed Approximate k-Core Decomposition and Min-Max Edge Orientation: Breaking the Diameter Barrier. T.-H. Hubert Chan, Mauro Sozio, Bintao Sun. 345-354
- FALCON: Efficient Designs for Zero-Copy MPI Datatype Processing on Emerging Architectures. Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda. 355-364
- Two Elementary Instructions Make Compare-and-Swap. Pankaj Khanchandani, Roger Wattenhofer. 365-374
- Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems. James Gentry, Chavit Denninnart, Mohsen Amini Salehi. 375-384
- Two Roads to Parallelism: From Serial Code to Programming with STAPL. Lawrence Rauchwerger. 385
- Z-Dedup: A Case for Deduplicating Compressed Contents in Cloud. Zhichao Yan, Hong Jiang, Yujuan Tan, Stan Skelton, Hao Luo. 386-395
- An Architecture and Stochastic Method for Database Container Placement in the Edge-Fog-Cloud Continuum. Petar Kochovski, Rizos Sakellariou, Marko Bajec, Pavel Drobintsev, Vlado Stankovski. 396-405
- Online Live VM Migration Algorithms to Minimize Total Migration Time and Downtime. Nikos Tziritas, Thanasis Loukopoulos, Samee Khan, Cheng-Zhong Xu, Albert Y. Zomaya. 406-417
- Semantics-Aware Virtual Machine Image Management in IaaS Clouds. Nishant Saurabh, Julian Remmers, Dragi Kimovski, Radu Prodan, Jorge G. Barbosa. 418-427
- Composing Optimization Techniques for Vertex-Centric Graph Processing via Communication Channels. Yongzhe Zhang, Zhenjiang Hu. 428-438
- CuSP: A Customizable Streaming Edge Partitioner for Distributed Graph Analytics. Loc Hoang, Roshan Dathathri, Gurbinder Gill, Keshav Pingali. 439-450
- Accelerating Sequence Alignment to Graphs. Chirag Jain, Sanchit Misra, Haowen Zhang, Alexander T. Dilthey, Srinivas Aluru. 451-461
- Accurate, Efficient and Scalable Graph Embedding. Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna. 462-471
- Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation. Ichitaro Yamazaki, Zhaojun Bai, Ding Lu, Jack J. Dongarra. 472-481
- Revisiting the I/O-Complexity of Fast Matrix Multiplication with Recomputations. Roy Nissim, Oded Schwartz. 482-490
- Computation of Matrix Chain Products on Parallel Machines. Elad Weiss, Oded Schwartz. 491-500
- Overlapping Communications with Other Communications and Its Application to Distributed Dense Matrix Computations. Hua Huang, Edmond Chow. 501-510
- Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems. Woong Shin, Christopher Brumgard, Bing Xie, Sudharshan S. Vazhkudai, Devarshi Ghoshal, Sarp Oral, Lavanya Ramakrishnan. 511-522
- NCQ-Aware I/O Scheduling for Conventional Solid State Drives. Hao Fan, Song Wu, Shadi Ibrahim, Ximing Chen, Hai Jin, Jiang Xiao, Haibing Guan. 523-532
- Optimizing the Parity Check Matrix for Efficient Decoding of RS-Based Cloud Storage Systems. Junqing Gu, Chentao Wu, Xin Xie, Han Qiu, Jie Li, Minyi Guo, Xubin He, Yuanyuan Dong, Yafei Zhao. 533-544
- D3: Deterministic Data Distribution for Efficient Data Reconstruction in Erasure-Coded Distributed Storage Systems. Zhipeng Li, Min Lv, Yinlong Xu, Yongkun Li, Liangliang Xu. 545-556
- SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Sunway TaihuLight. Zhao Liu, XueSen Chu, Xiaojing Lv, Hongsong Meng, Shupeng Shi, Wenji Han, Jingheng Xu, Haohuan Fu, Guangwen Yang. 557-566
- Containers in HPC: A Scalability and Portability Study in Production Biological Simulations. Oleksandr Rudyy, Marta Garcia-Gasulla, Filippo Mantovani, Alfonso Santiago, Raül Sirvent, Mariano Vázquez. 567-577
- PaKman: Scalable Assembly of Large Genomes on Distributed Memory Machines. Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman. 578-589
- Language Modeling at Scale. Md. Mostofa Ali Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Greg Diamos, Kenneth Church. 590-599
- DYRS: Bandwidth-Aware Disk-to-Memory Migration of Cold Data in Big-Data File Systems. Simbarashe Dzinamarira, Florin Dinu, T. S. Eugene Ng. 600-609
- iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems. Bharti Wadhwa, Arnab Kumar Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali Raza Butt, Jon Bernard, Kirk W. Cameron. 610-620
- SimFS: A Simulation Data Virtualizing File System Interface. Salvatore Di Girolamo, Pirmin Schmid, Thomas C. Schulthess, Torsten Hoefler. 621-630
- Sizing and Partitioning Strategies for Burst-Buffers to Reduce IO Contention. Guillaume Aupy, Olivier Beaumont, Lionel Eyraud-Dubois. 631-640
- On Optimizing Complex Stencils on GPUs. Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan. 641-652
- Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs. Wenyi Zhao, Quan Chen, Hao Lin, Jianfeng Zhang, Jingwen Leng, Chao Li, Wenli Zheng, Li Li, Minyi Guo. 653-663
- Exploiting Adaptive Data Compression to Improve Performance and Energy-Efficiency of Compute Workloads in Multi-GPU Systems. Mohammad Khavari Tavana, Yifan Sun, Nicolas Bohm Agostini, David R. Kaeli. 664-674
- Dual Pattern Compression Using Data-Preprocessing for Large-Scale GPU Architectures. Kyung-Hoon Kim, Priyank Devpura, Abhishek Nayyar, Andrew Doolittle, Ki Hwan Yum, Eun Jung Kim. 675-685
- Adapting Batch Scheduling to Workload Characteristics: What Can We Expect From Online Learning? Arnaud Legrand, Denis Trustram, Salah Zrigui. 686-695
- Aladdin: Optimized Maximum Flow Management for Shared Production Clusters. Heng Wu, Wenbo Zhang, Yuanjia Xu, Hao Xiang, Tao Huang, Haiyang Ding, Zheng Zhang. 696-707
- mmWave Wireless Backhaul Scheduling of Stochastic Packet Arrivals. Pawel Garncarek, Tomasz Jurdzinski, Dariusz R. Kowalski, Miguel A. Mosteiro. 708-717
- Tight & Simple Load Balancing. Petra Berenbrink, Tom Friedetzky, Dominik Kaaser, Peter Kling. 718-726
- The Path to Delivering Programable Exascale Systems. Luiz DeRose. 727
- An Error-Reflective Consistency Model for Distributed Data Stores. Philip Dexter, Kenneth Chiu, Bedri Sendir. 728-737
- A High-Performance Distributed Relational Database System for Scalable OLAP Processing. Jason Arnold, Boris Glavic, Ioan Raicu. 738-748
- An Approach for Parallel Loading and Pre-Processing of Unstructured Meshes Stored in Spatially Scattered Fashion. Ondrej Meca, Lubomír Ríha, Tomás Brzobohatý. 749-760
- Exploring MPI Communication Models for Graph Applications Using Graph Matching as a Case Study. Sayan Ghosh, Mahantesh Halappanavar, Ananth Kalyanaraman, Arif Khan, Assefaw H. Gebremedhin. 761-770
- BigSpa: An Efficient Interprocedural Static Analysis Engine in the Cloud. Zhiqiang Zuo, Rong Gu, Xi Jiang, Zhaokang Wang, Yihua Huang, Linzhang Wang, Xuandong Li. 771-780
- An Efficient Collaborative Communication Mechanism for MPI Neighborhood Collectives. S. Mahdieh Ghazimirsaeed, Seyed Hessam Mirsadeghi, Ahmad Afsahi. 781-792
- Understanding the Impact of Dynamic Power Capping on Application Progress. Srinivasan Ramesh, Swann Perarnau, Sridutt Bhalachandra, Allen D. Malony, Peter H. Beckman. 793-804
- Modelling DVFS and UFS for Region-Based Energy Aware Tuning of HPC Applications. Mohak Chadha, Michael Gerndt. 805-814
- SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers. Wenli Zheng, Xiaorui Wang, Yue Ma, Chao Li, Hao Lin, Bin Yao, Jianfeng Zhang, Minyi Guo. 815-824
- Drowsy-DC: Data Center Power Management System. Mathieu Bacou, Grégoire Todeschi, Alain Tchana, Daniel Hagimont, Baptiste Lepers, Willy Zwaenepoel. 825-834
- Distributed Dominating Set and Connected Dominating Set Construction Under the Dynamic SINR Model. Dongxiao Yu, Yifei Zou, Yong Zhang, Feng Li, Jiguo Yu, Yu Wu, Xiuzhen Cheng, Francis C. M. Lau. 835-844
- MULTISKIPGRAPH: A Self-Stabilizing Overlay Network that Maintains Monotonic Searchability. Linghui Luo, Christian Scheideler, Thim Strothmann. 845-854
- Network Size Estimation in Small-World Networks Under Byzantine Faults. Soumyottam Chatterjee, Gopal Pandurangan, Peter Robinson. 855-865
- MD-GAN: Multi-Discriminator Generative Adversarial Networks for Distributed Datasets. Corentin Hardy, Erwan Le Merrer, Bruno Sericola. 866-877
- MOARD: Modeling Application Resilience to Transient Faults on Data Objects. Luanzheng Guo, Dong Li. 878-889
- SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications. Giorgis Georgakoudis, Ignacio Laguna, Hans Vandierendonck, Dimitrios S. Nikolopoulos, Martin Schulz. 890-899
- Optimal Placement of In-memory Checkpoints Under Heterogeneous Failure Likelihoods. Zaeem Hussain, Taieb Znati, Rami G. Melhem. 900-910
- VeloC: Towards High Performance Adaptive Asynchronous Checkpointing at Large Scale. Bogdan Nicolae, Adam Moody, Elsa Gonsiorowski, Kathryn Mohror, Franck Cappello. 911-920
- HART: A Concurrent Hash-Assisted Radix Tree for DRAM-PM Hybrid Memory Systems. Wen Pan, Tao Xie, Xiaojia Song. 921-931
- LLC-Guided Data Migration in Hybrid Memory Systems. Evangelos Vasilakis, Vassilis Papaefstathiou, Pedro Trancoso, Ioannis Sourdis. 932-942
- Software-Based Buffering of Associative Operations on Random Memory Addresses. Matthias Hauck, Marcus Paradies, Holger Fröning. 943-952
- Combining Prefetch Control and Cache Partitioning to Improve Multicore Performance. Gongjin Sun, Junjie Shen, Alexander V. Veidenbaum. 953-962
- UPC++: A High-Performance Communication Framework for Asynchronous Computation. John Bachan, Scott B. Baden, Steven A. Hofmeyr, Mathias Jacquelin, Amir Kamil, Dan Bonachea, Paul H. Hargrove, Hadia Ahmed. 963-973
- Cpp-Taskflow: Fast Task-Based Parallel Programming Using Modern C++. Tsung-Wei Huang, Chun-Xun Lin, Guannan Guo, Martin D. F. Wong. 974-983
- Portal: A High-Performance Language and Compiler for Parallel N-Body Problems. Laleh Aghababaie Beni, Saikiran Ramanan, Aparna Chandramowlishwaran. 984-995
- SAC Goes Cluster: Fully Implicit Distributed Computing. Thomas Macht, Clemens Grelck. 996-1006
- Incremental Graph Processing for On-line Analytics. Scott Sallinen, Roger Pearce, Matei Ripeanu. 1007-1018
- Incrementalization of Vertex-Centric Programs. Timothy A. K. Zakian, Ludovic A. R. Capelli, Zhenjiang Hu. 1019-1029
- GraphTinker: A High Performance Data Structure for Dynamic Graph Processing. Wole Jaiyeoba, Kevin Skadron. 1030-1041
- FastJoin: A Skewness-Aware Distributed Stream Join System. Shunjie Zhou, Fan Zhang, Hanhua Chen, Hai Jin, Bing Bing Zhou. 1042-1052
- A Bin-Based Bitstream Partitioning Approach for Parallel CABAC Decoding in Next Generation Video Coding. Philipp Habermann, Chi Ching Chi, Mauricio Alvarez Mesa, Ben H. H. Juurlink. 1053-1062
- Stochastic Gradient Descent on Modern Hardware: Multi-core CPU or GPU? Synchronous or Asynchronous? Yujing Ma, Florin Rusu, Martin Torres. 1063-1072
- Always be Two Steps Ahead of Your Enemy. Thorsten Götte, Vipin Ravindran Vijayalakshmi, Christian Scheideler. 1073-1082
- Peace Through Superior Puzzling: An Asymmetric Sybil Defense. Diksha Gupta, Jared Saia, Maxwell Young. 1083-1094
- Rethinking Support for Region Conflict Exceptions. Swarnendu Biswas, Rui Zhang, Michael D. Bond, Brandon Lucia. 1095-1106