- Welcome Message from the IEEE Cluster 2024 Program Chairs. Yutong Lu, Wuchun Feng, Mohamed Wahib.
- GPU Reliability Assessment: Insights Across the Abstraction Layers. Lishan Yang, George Papadimitriou 0001, Dimitris Sartzetakis, Adwait Jog, Evgenia Smirni, Dimitris Gizopoulos. 1-13
- Siesta: Synthesizing Proxy Applications for MPI Programs. Jiyu Luo, Tao Yan, Qingguo Xu, Jingwei Sun, Guangzhong Sun. 14-26
- Distributed Order Recording Techniques for Efficient Record-and-Replay of Multi-Threaded Programs. Xiang Fu, Shiman Meng, Weiping Zhang, Luanzheng Guo, Kento Sato, Dong H. Ahn, Ignacio Laguna, Gregory L. Lee, Martin Schulz 0001. 27-38
- FTGraph: A Flexible Tree-Based Graph Store on Persistent Memory for Large-Scale Dynamic Graphs. Gan Sun, Jiang Zhou, Bo Li 0063, Xiaoyan Gu 0001, Weiping Wang 0005, Shuibing He. 39-50
- PGSampler: Accelerating GPU-Based Graph Sampling in GNN Systems via Workload Fusion. Xiaohui Wei 0002, Weikai Tang, Hao Qi, Hengshan Yue. 51-61
- MassiveGNN: Efficient Training via Prefetching for Massively Connected Distributed Graphs. Aishwarya Sarkar, Sayan Ghosh, Nathan R. Tallent, Ali Jannesari. 62-73
- A Protocol to Assess the Accuracy of Process-Level Power Models. Emile Cadorel, Dimitri Saingre. 74-84
- Holistic Performance Analysis for Asynchronous Many-Task Runtimes. Omri Mor, George Bosilca, Marc Snir. 85-96
- Automated Approach for Accurate CPU Power Modelling. Tomé Maseda, Jonatan Enes, Roberto R. Expósito, Juan Touriño. 97-107
- MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns. Majid Salimi Beni, Biagio Cosenza, Sascha Hunold. 108-119
- Optimizing Neighbor Collectives with Topology Objects. Gerald Collom, Derek Schafer, Amanda Bienz, Patrick G. Bridges, Galen M. Shipman. 120-130
- A Topology- and Load-Aware Design for Neighborhood Allgather. Hamed Sharifian, Amir Hossein Sojoodi, Ahmad Afsahi. 131-142
- Uncut-GEMMs: Communication-Aware Matrix Multiplication on Multi-GPU Nodes. Petros Anastasiadis, Nikela Papadopoulou, Nectarios Koziris, Georgios I. Goumas. 143-154
- High-Performance FFT Code Generation via MLIR Linalg Dialect and SIMD Micro-Kernels. Yifei He, Stefano Markidis. 155-165
- Understanding Mixed Precision GEMM with MPGemmFI: Insights into Fault Resilience. Bo Fang 0002, Xinyi Li, Harvey Dam, Cheng Tan 0002, Siva Kumar Sastry Hari, Timothy Tsai 0002, Ignacio Laguna, Dingwen Tao, Ganesh Gopalakrishnan, Prashant J. Nair, Kevin J. Barker, Ang Li 0006. 166-178
- Parallelism or Fairness? How to Be Friendly for SSDs in Cloud Environments. Yang Zhou, Fang Wang 0001, Zhan Shi 0001, Dan Feng 0001. 179-189
- SlackVM: Packing Virtual Machines in Oversubscribed Cloud Infrastructures. Pierre Jacquet, Thomas Ledoux, Romain Rouvoy. 190-201
- RL-Cache: An Efficient Reinforcement Learning Based Cache Partitioning Approach for Multi-Tenant CDN Services. Ranhao Jia, Zixiao Chen, Chentao Wu, Jie Li 0002, Minyi Guo, Hongwen Huang. 202-213
- FCUFS: Core-Level Frequency Tuning for Energy Optimization on Intel Processors. Hongjian Zhang, Akira Nukada, Qiucheng Liao. 214-225
- ML-Based Dynamic Operator-Level Query Mapping for Stream Processing Systems in Heterogeneous Computing Environments. Sejeong Oh, Gordon Euhyun Moon, Sungyong Park. 226-237
- Enabling Practical Transparent Checkpointing for MPI: A Topological Sort Approach. Yao Xu, Gene Cooperman. 238-249
- Enabling Workload-Driven Elasticity in MPI-based Ensembles. Md Rajib Hossen, Vanessa V. Sochat, Abhik Sarkar, Mohammad A. Islam 0001, Daniel J. Milroy. 250-262
- Geo-Distributed Analytical Streaming Architecture for IoT Platforms. Mohammad Reza Hoseiny Farahabady, Albert Y. Zomaya. 263-274
- Seastar: A Cache-Efficient and Load-Balanced Key-Value Store on Disaggregated Memory. Jingwen Du, Fang Wang 0001, Dan Feng 0001, Dexin Zeng, Sheng Yi. 275-285
- HEFTLess: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum. Reza Farahani, Narges Mehran, Sashko Ristov, Radu Prodan. 286-296
- Job Scheduling in High Performance Computing Systems with Disaggregated Memory Resources. Jie Li 0057, George Michelogiannakis, Samuel Maloney, Brandon Cook 0001, Estela Suarez, John Shalf, Yong Chen 0001. 297-309
- Fully Decentralized Data Distribution for Exascale-HPC: End of the Provider-Demander Matching Puzzle. Mingtian Shao, Wenzhe Zhang, Ruibo Wang, Huijun Wu 0001, Yiqin Dai, Kai Lu. 310-321
- FT K-Means: A High-Performance K-Means on GPU with Fault Tolerance. Shixun Wu, Yitong Ding, Yujia Zhai, Jinyang Liu 0003, Jiajun Huang, Zizhe Jian, Huangliang Dai, Sheng Di, Bryan M. Wong, Zizhong Chen, Franck Cappello. 322-334
- ScalFrag: Efficient Tiled-MTTKRP with Adaptive Launching on GPUs. Wenqing Lin, Hemeng Wang, Haodong Deng, Qingxiao Sun. 335-345
- Leveraging High-Performance Data Transfer to Offload Data Management Tasks to SmartNICs. Scott Levy, Whit Schonbein, Craig D. Ulmer. 346-356
- DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics. Meng Tang, Jaime Cernuda, Jie Ye, Luanzheng Guo, Nathan R. Tallent, Anthony Kougkas, Xian-He Sun. 357-369
- Sizey: Memory-Efficient Execution of Scientific Workflow Tasks. Jonathan Bader, Fabian Skalski, Fabian Lehmann, Dominik Scheinert, Jonathan Will, Lauritz Thamsen, Odej Kao. 370-381
- Phase-Based Data Placement Optimization in Heterogeneous Memory. Jannis Klinkenberg, Clément Foyer, Pierre Clouzet, Brice Goglin, Emmanuel Jeannot, Christian Terboven, Anara Kozhokanova. 382-393
- Xphase3d: Memory-Distributed Phase Retrieval for Reconstructing Large-Scale 3D Density Maps of Biological Macromolecules. Wenyang Zhao, Osamu Miyashita, Miki Nakano, Florence Tama. 394-402
- Accuracy-Efficiency Optimization for Multi-Stage Small Object Detection in Surveillance Video with Collaborative Frame Sampling. Chunhong Du, Shanjiang Tang, Song Meng, Jiekai Gou, Ce Yu, Yusen Li, Hao Fu, Ye Tian, Ding Yuan. 403-413
- Modernizing an Operational Real-Time Tsunami Simulator to Support Diverse Hardware Platforms. Keichi Takahashi, Takashi Abe, Akihiro Musa, Yoshihiko Sato, Yoichi Shimomura, Hiroyuki Takizawa, Shunichi Koshimura. 414-425
- I/O Behind the Scenes: Bandwidth Requirements of HPC Applications with Asynchronous I/O. Ahmad Tarraf, Javier Fernández Muñoz, David E. Singh, Taylan Özden, Jesús Carretero 0001, Felix Wolf 0001. 426-439
- FINCHFS: Design of Ad-Hoc File System for I/O Heavy HPC Workloads. Sohei Koyama, Kohei Hiraga, Osamu Tatebe. 440-450
- A High-Performance and Fast-Recovery Scheme for Secure Non-Volatile Memory Systems. Yujie Shi, Yu Hua, Jianming Huang 0001. 451-463