- SKV: A SmartNIC-Offloaded Distributed Key-Value Store. Shangyi Sun, Rui Zhang, Ming Yan, Jie Wu 0003. 1-11 [doi]
- Painless Transposition of Reproducible Distributed Environments with NixOS Compose. Quentin Guilloteau, Jonathan Bleuzen, Millian Poquet, Olivier Richard. 1-12 [doi]
- Bring the BitCODE - Moving Compute and Data in Distributed Heterogeneous Systems. Wenbin Lu, Luis E. Peña, Pavel Shamis, Valentin Churavy, Barbara M. Chapman, Steve Poole. 12-22 [doi]
- Exploring Light-weight Cryptography for Efficient and Secure Lossy Data Compression. Ruiwen Shan, Sheng Di, Jon C. Calhoun, Franck Cappello. 23-34 [doi]
- What does Inter-Cluster Job Submission and Execution Behavior Reveal to Us? Tirthak Patel, Devesh Tiwari, Raj Kettimuthu, William E. Allcock, Paul Rich, Zhengchun Liu. 35-46 [doi]
- MRSch: Multi-Resource Scheduling for HPC. Boyang Li, Yuping Fan, Matthew T. Dearing, Zhiling Lan, Paul Rich, William E. Allcock, Michael E. Papka. 47-57 [doi]
- Matching-based Scheduling of Asynchronous Data Processing Workflows on the Computing Continuum. Narges Mehran, Zahra Najafabadi Samani, Dragi Kimovski, Radu Prodan. 58-70 [doi]
- Spark Meets MPI: Towards High-Performance Communication Framework for Spark using MPI. Kinan Al-Attar, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda 0001. 71-81 [doi]
- Deadlock Detection for MPI Programs Based on Refined Match-sets. Shushan Li, Meng Wang, Hong Zhang. 82-93 [doi]
- A framework for hierarchical single-copy MPI collectives on multicore nodes. George Katevenis, Manolis Ploumidis, Manolis Marazakis. 94-105 [doi]
- PYTHIA: an oracle to guide runtime system decisions. Alexis Colin, François Trahay, Denis Conan. 106-116 [doi]
- Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG. Joseph Schuchart, Poornima Nookala, Thomas Hérault, Edward F. Valeev, George Bosilca. 117-128 [doi]
- Distributed Continuation Stealing is More Scalable than You Might Think. Shumpei Shiina, Kenjiro Taura. 129-141 [doi]
- Fast(er) Construction of Round-optimal $n$-Block Broadcast Schedules. Jesper Larsson Träff. 142-151 [doi]
- Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs. Sébastien Cayrols, Jiali Li, George Bosilca, Stanimire Tomov, Alan Ayala, Jack J. Dongarra. 152-160 [doi]
- ACCLAiM: Advancing the Practicality of MPI Collective Communication Autotuning Using Machine Learning. Michael Wilkins, Yanfei Guo, Rajeev Thakur, Peter A. Dinda, Nikos Hardavellas. 161-171 [doi]
- Call Scheduling to Reduce Response Time of a FaaS System. Pawel Zuk, Bartlomiej Przybylski, Krzysztof Rzadca. 172-182 [doi]
- FaaSt: Optimize makespan of serverless workflows in federated commercial FaaS. Sashko Ristov, Philipp Gritsch. 183-194 [doi]
- Last-mile Matters: Mitigating the Tail Latency of Virtualized Networks with Multipath Data Plane. Dian Shen, Yi Zhai, Fang Dong 0001, Junzhou Luo. 195-205 [doi]
- Towards Virtual Certification of Gas Turbine Engines With Performance-Portable Simulations. Gihan R. Mudalige, István Z. Reguly, Arun Prabhakar, Dario Amirante, Leigh Lapworth, Stephen A. Jarvis. 206-217 [doi]
- Hybrid Analysis of Fusion Data for Online Understanding of Complex Science on Extreme Scale Computers. Eric Suchyta, Jong Youl Choi, Seung-Hoe Ku, David Pugmire, Ana Gainaru, Kevin A. Huck, Ralph Kube, Aaron Scheinberg, Frédéric Suter, Choong-Seock Chang, Todd S. Munson, Norbert Podhorszki, Scott Klasky. 218-229 [doi]
- High Performance Adaptive Physics Refinement to Enable Large-Scale Tracking of Cancer Cell Trajectory. Daniel F. Puleri, Sayan Roychowdhury, Peter Balogh, John Gounley, Erik W. Draeger, Jeff Ames, Adebayo Adebiyi, Simbarashe Chidyagwai, Benjamín Hernández, Seyong Lee, Shirley V. Moore, Jeffrey S. Vetter, Amanda Randles. 230-242 [doi]
- Extracting and characterizing I/O behavior of HPC workloads. Hariharan Devarajan, Kathryn Mohror. 243-255 [doi]
- Be SMART, Save I/O: A Probabilistic Approach to Avoid Uncorrectable Errors in Storage Systems. Md. Arifuzzaman, Masudul Bhuiyan, Mehmet Gümüs, Engin Arslan. 256-266 [doi]
- The role of storage target allocation in applications' I/O performance with BeeGFS. Francieli Boito, Guillaume Pallez, Luan Teylo. 267-277 [doi]
- ecoHMEM: Improving Object Placement Methodology for Hybrid Memory Systems in HPC. Marc Jordà, Siddharth Rai, Eduard Ayguadé, Jesús Labarta, Antonio J. Peña. 278-288 [doi]
- Efficient Hierarchical State Vector Simulation of Quantum Circuits via Acyclic Graph Partitioning. Bo Fang, M. Yusuf Özkaval, Ang Li, Ümit V. Çatalyürek, Sriram Krishnamoorthy. 289-300 [doi]
- AutoPipe: A Fast Pipeline Parallelism Approach with Balanced Partitioning and Micro-batch Slicing. Weijie Liu, Zhiquan Lai, Shengwei Li, Yabo Duan, Keshi Ge, Dongsheng Li. 301-312 [doi]
- HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training. Yabo Duan, Zhiquan Lai, Shengwei Li, Weijie Liu, Keshi Ge, Peng Liang, Dongsheng Li. 313-323 [doi]
- Hvac: Removing I/O Bottleneck for Large-Scale Deep Learning Applications. Awais Khan 0002, Arnab K. Paul, Christopher Zimmer 0001, Sarp Oral, Sajal Dash, Scott Atchley, Feiyi Wang. 324-335 [doi]
- Enabling Dynamic Virtual Frequency Scaling for Virtual Machines in the Cloud. Emile Cadorel, Romain Rouvoy. 336-346 [doi]
- The Cost of Flexibility: Embedded versus Discrete Routers in CGRAs for HPC. Boma A. Adhi, Carlos Cortes, Yiyu Tan, Takuya Kojima, Artur Podobas, Kentaro Sano. 347-356 [doi]
- SVAGC: Garbage Collection with a Scalable Virtual Address Swapping Technique. Ismail Ataie, Weikuan Yu. 357-368 [doi]
- ALBADross: Active Learning Based Anomaly Diagnosis for Production HPC Systems. Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Ayse K. Coskun. 369-380 [doi]
- HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization. Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, Robert B. Ross. 381-393 [doi]
- fairDMS: Rapid Model Training by Data and Model Reuse. Ahsan Ali, Hemant Sharma, Rajkumar Kettimuthu, Peter Kenesei, Dennis Trujillo, Antonino Miceli, Ian T. Foster, Ryan Coffee, Jana Thayer, Zhengchun Liu. 394-405 [doi]
- Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach. Matthew Whitlock, Nicolas Morales, George Bosilca, Aurelien Bouteiller, Bogdan Nicolae, Keita Teranishi, Elisabeth Giem, Vivek Sarkar. 418-428 [doi]
- Fast Dynamic Updates and Dynamic SpGEMM on MPI-Distributed Graphs. Alexander van der Grinten, Geert Custers, Duy Le Thanh, Henning Meyerhenke. 429-439 [doi]
- BALA-CPD: BALanced and Asynchronous Distributed Tensor Decomposition. Zheng Miao, Jiajia Li, Jon C. Calhoun, Rong Ge 0002. 440-450 [doi]
- Optimizing Irregular-Shaped Matrix-Matrix Multiplication on Multi-Core DSPs. Shangfei Yin, Qinglin Wang, Ruochen Hao, Tianyang Zhou, Songzhu Mei, Jie Liu 0002. 451-461 [doi]
- Optimizations of H-matrix-vector Multiplication for Modern Multi-core Processors. Tetsuya Hoshino, Akihiro Ida, Toshihiro Hanawa. 462-472 [doi]
- Recursive Multi-Section on the Fly: Shared-Memory Streaming Algorithms for Hierarchical Graph Partitioning and Process Mapping. Marcelo Fonseca Faraj, Christian Schulz 0003. 473-483 [doi]
- MemGaze: Rapid and Effective Load-Level Memory Trace Analysis. Ozgur O. Kilic, Nathan R. Tallent, Yasodha Suriyakumar, Chenhao Xie 0001, Andrès Márquez, Stéphane Eranian. 484-495 [doi]
- Empirical Study on the GPU-accelerated HPL Performance: Effects of PCIe Communication. Jieun Choi, Yosang Jeong, Ji Hoon Kang 0002, Gibeom Gu, Hoon Ryu. 496-497 [doi]
- H2M: Towards Heuristics for Heterogeneous Memory. Clément Foyer, Brice Goglin, Emmanuel Jeannot, Jannis Klinkenberg, Anara Kozhokanova, Christian Terboven. 498-499 [doi]
- An Analysis of Performance Variability on Dragonfly+ topology. Majid Salimi Beni, Biagio Cosenza. 500-501 [doi]
- An Asynchronous Parallel Algorithm to Improve the Scalability of Finite Element Solvers. Zhuo Tian, Changyou Zhang. 502-503 [doi]
- An Efficient Sparse CNNs Accelerator on FPGA. Yonghua Zhang, Hongxu Jiang, Xiaobin Li, Haojie Wang, Dong Dong, Yongxiang Cao. 504-505 [doi]
- A Performance Evaluation of Adaptive MPI for a Particle-In-Cell Code. Christian Asch, Diego Jiménez, Markus Rampp, Erwin Laure, Esteban Meneses. 506-511 [doi]
- Scalable Architectures to Support Sustainable Advanced Information Technologies. Oscar Carrillo, Carlos Jaime Barrios Hernández, Frédéric Le Mouël, Harold Enrique Castro Barrera, Yves Denneulin, José Tiberio Hernández, Fernando Jiménez Vargas, Lola Xiomara Bautista Rozo, Claudia Roncancio, Michel Riveill. 512-515 [doi]
- Early Experiences of Noise-Sensitivity Performance Analysis of a Distributed Deep Learning Framework. Elvis Rojas, Michael Knobloch, Nour Daoud, Esteban Meneses, Bernd Mohr. 516-522 [doi]
- Learning tenant behavior and evolutionary approach for demand response in colocation datacenters. Jonathan Muraña, Santiago Iturriaga, Sergio Nesmachnow. 523-527 [doi]
- Impact of Containerization on Low-Cost Post Moore Computing Architectures. Pablo Josue Rojas Yepes, Carlos Jaime Barrios Hernández, Luiz Angelo Steffenel. 528-534 [doi]
- Automatic vehicle counting area creation based on vehicle Deep Learning detection and DBSCAN. Gerardo Alvarez Piña, Eduardo Ulises Moya-Sánchez, Abraham Sánchez Pérez, Ulises Cortés. 535-538 [doi]
- On Using Linux Kernel Huge Pages with FLASH, an Astrophysical Simulation Code. Alan C. Calder, Catherine Feldman, Eva Siegmann, John Dey, Anthony Curtis, Smeet Chheda, Robert J. Harrison. 539-544 [doi]
- Performance of an Astrophysical Radiation Hydrodynamics Code under Scalable Vector Extension Optimization. Dennis C. Smolarski, F. Douglas Swesty, Alan C. Calder. 545-548 [doi]
- Productivity meets Performance: Julia on A64FX. Mosè Giordano, Milan Klöwer, Valentin Churavy. 549-555 [doi]
- Assessing the State of Autovectorization Support based on SVE. Bine Brank, Dirk Pleiter. 556-562 [doi]
- Performance analysis of a state vector quantum circuit simulation on A64FX processor. Miwako Tsuji, Mitsuhisa Sato. 563-572 [doi]
- Protecting Metadata Servers From Harm Through Application-level I/O Control. Ricardo Macedo, Mariana Miranda, Yusuke Tanimura, Jason Haga, Amit Ruhela, Stephen Lien Harrell, Richard Todd Evans, João Paulo 0001. 573-580 [doi]
- A Comprehensive I/O Knowledge Cycle for Modular and Automated HPC Workload Analysis. Zhaobin Zhu, Sarah Neuwirth, Thomas Lippert. 581-588 [doi]
- Assessment of the I/O and Storage Subsystem in Modular Supercomputing Architectures. Sarah Neuwirth. 589-596 [doi]
- Towards Real-Time Classification of HPC Workloads via Out-of-band Telemetry. Steven Presser. 597-601 [doi]
- Shasta Log Aggregation, Monitoring and Alerting in HPC Environments with Grafana Loki and ServiceNow. Elizabeth Bautista, Nitin Sukhija, Siqi Deng. 602-610 [doi]
- Bridging the Gap between Application Performance Analysis and System Monitoring. Thomas Ilsche, Mario Bielert, Christian von Elm. 611-615 [doi]
- IncProf: Efficient Source-Oriented Phase Identification for Application Behavior Understanding. Omar Aaziz, Mohammad Al-Tahat, Strahinja Trecakov, Jonathan Cook. 616-625 [doi]
- LDMS Darshan Connector: For Run Time Diagnosis of HPC Application I/O Performance. Sara Walton, Omar Aaziz, Ana Luisa Veroneze Solórzano, Benjamin Schwaller. 626-634 [doi]