- Hardware Specialization for Distributed Computing. Gustavo Alonso. 1
- Computing Challenges for High Energy Physics. Maria Girone. 3
- Superscalar Programming Models: A Perspective from Barcelona. Rosa M. Badia. 5
- A Serverless Framework for Distributed Bulk Metadata Extraction. Tyler J. Skluzacek, Ryan Wong, Zhuozhao Li, Ryan Chard, Kyle Chard, Ian T. Foster. 7-18
- File System Semantics Requirements of HPC Applications. Chen Wang 0004, Kathryn Mohror, Marc Snir. 19-30
- DStore: A Fast, Tailless, and Quiescent-Free Object Store for PMEM. Shashank Gugnani, Xiaoyi Lu. 31-43
- Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling. Sian Jin, Jesus Pulido, Pascal Grosset, Jiannan Tian, Dingwen Tao, James P. Ahrens. 45-56
- ARC: An Automated Approach to Resiliency for Lossy Compressed Data via Error Correcting Codes. Dakota Fulp, Alexandra Poulos, Robert Underwood, Jon C. Calhoun. 57-68
- MPI-CorrBench: Towards an MPI Correctness Benchmark Suite. Jan-Patrick Lehr, Tim Jammer, Christian Bischof. 69-80
- Cache-aware Sparse Patterns for the Factorized Sparse Approximate Inverse Preconditioner. Sergi Laut, Ricard Borrell, Marc Casas. 81-93
- TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes. Carl Pearson, Kun Wu, I-Hsin Chung, Jinjun Xiong, Wen-mei Hwu. 95-106
- SnuRHAC: A Runtime for Heterogeneous Accelerator Clusters with CUDA Unified Memory. Jaehoon Jung, Daeyoung Park, Gangwon Jo, Jungho Park, Jaejin Lee. 107-120
- Scalable All-pairs Shortest Paths for Huge Graphs on Multi-GPU Clusters. Piyush Sao, Hao Lu, Ramakrishnan Kannan, Vijay Thakkar, Richard W. Vuduc, Thomas E. Potok. 121-131
- AITurbo: Unified Compute Allocation for Partial Predictable Training in Commodity Clusters. Laiping Zhao, Fangshu Li, Wenyu Qu, Kunlin Zhan, Qingman Zhang. 133-145
- Apollo: An ML-assisted Real-Time Storage Resource Observer. Neeraj Rajesh, Hariharan Devarajan, Jaime Cernuda Garcia, Keith Bateman, Luke Logan, Jie Ye, Anthony Kougkas, Xian-He Sun. 147-159
- An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks. Albert Njoroge Kahira, Truong Thao Nguyen, Leonardo Bautista-Gomez, Ryousei Takano, Rosa M. Badia, Mohamed Wahib. 161-173
- DRLPart: A Deep Reinforcement Learning Framework for Optimally Efficient and Robust Resource Partitioning on Commodity Servers. Ruobing Chen 0002, Jinping Wu, Haosen Shi, Yusen Li, Xiaoguang Liu 0001, Gang Wang 0001. 175-188
- Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network. Yao Kang, Xin Wang, Zhiling Lan. 189-200
- Jigsaw: A High-Utilization, Interference-Free Job Scheduler for Fat-Tree Clusters. Staci A. Smith, David K. Lowenthal. 201-213
- Towards Exploiting CPU Elasticity via Efficient Thread Oversubscription. Hang Huang, Jia Rao, Song Wu, Hai Jin 0001, Hong Jiang, Hao Che, Xiaofeng Wu. 215-226
- DLion: Decentralized Distributed Deep Learning in Micro-Clouds. Rankyung Hong, Abhishek Chandra. 227-238
- LaSS: Running Latency Sensitive Serverless Computations at the Edge. Bin Wang, Ahmed Ali-Eldin, Prashant J. Shenoy. 239-251
- Machine Learning Augmented Hybrid Memory Management. Thaleia Dimitra Doudali, Ada Gavrilovska. 253-254
- Using Pilot Jobs and CernVM File System for Simplified Use of Containers and Software Distribution. Namratha Urs, Marco Mambelli, Dave Dykstra. 255-256
- Achieving Scalable Consensus by Being Less Writey. Michael Davis, Hans Vandierendonck. 257-258
- Parallel Program Scaling Analysis using Hardware Counters. Shobhit Jagga, Preeti Malakar. 259-260
- CharminG: A Scalable GPU-resident Runtime System. Jaemin Choi, David F. Richards, Laxmikant V. Kalé. 261-262
- Productive Programming of Distributed Systems with the SHAD C++ Library. Vito Giovanni Castellana, Marco Minutoli. 263-264