- Quantifying the Economic Potential of Energy-Aware Scheduling in HPC. Kevin Menear, James Donson, Suzy DiMont, Justin Strelka, Struan Clark, Michelle Slovensky. 1-12
- Co-Design of a Power State-Aware Scheduler and an Intelligent Power Manager for Energy-Efficient HPC Systems. Raka Satya Prasasta, Santana Yuda Pradata, Kadek Gemilang Santiyuda, Muhammad Alfian Amrizal, Reza Pulungan, Hiroyuki Takizawa. 13-21
- Combining System- and User-Level Approaches to Improving Energy Efficiency in GPU-Based Supercomputers. Kohei Yoshida, Hayato Yamaki, Hiroki Honda, Kento Sato, Shinobu Miwa. 22-30
- System-Level Energy Profiling of Wafer-Scale AI Systems: Characterizing Non-Accelerator Overheads in the Cerebras CS-2 System. Jophin John, Hoi Fong Mak, Michael Hoffmann, Alice Zhang, Tapasya Patki, Nicolay Hammer. 31-39
- Harvesting energy consumption on European HPC systems: Sharing Experience from the CEEC project. Kajol Kulkarni, Samuel Kemmler, Anna Schwarz, Gülçin Gedik, Yanxiang Chen, Dimitrios Papageorgiou, Ioannis Kavroulakis, Roman Iakymchuk. 40-49
- On the Challenges of Energy-Efficiency Analysis in HPC Systems: Evaluating Synthetic Benchmarks and Gromacs. Rafael Ravedutti Lucio Machado, Jan Eitzinger, Georg Hager, Gerhard Wellein. 50-58
- Leveraging NVML GPM for NVIDIA GPU Monitoring. Christian Wassermann, Tobias Dollenbacher, Christian Terboven, Matthias S. Müller. 59-68
- Providing Thermal Stability for an Exascale Supercomputer: A Case Study of Frontier's Cooling System. David Grant, Luca Bortot, Chris DePrater, Dave Martinez, Ryan E. Grant, Natalie Bates. 69-78
- Adaptive Approximation-Aware Scheduling for Heterogeneous Computing Using Reinforcement Learning. Ezgi Nur Alisan, Ismail Akturk. 79-88
- Distributed Runtime Support for Portable and Scalable Execution of Heterogeneous Applications. Furkan Ozdemir, Ismail Akturk. 89-99
- Score-P with Arm(s) around the world.. Christian Feld, Gregor Corbin, Brian J. N. Wylie. 100-109
- System Software Utilization on an ARM-Based Supercomputer: Insights from a Production-Scale System. Yosuke Asai, Kento Sato, Keiji Yamamoto, Hitoshi Murai, Kohei Yoshida. 110-121
- Cross-architecture power efficiency analysis through micro-benchmarking. Fabio Banchelli, Filippo Mantovani, Filippo Spiga. 122-132
- Solving large-scale eigen problem in quantum few-body system on massive parallel computer. Daisuke Yoshida, Emiko Hayama, Toshiyuki Imamura, Issaku Kanamori, Hideo Matsufuru. 133-144
- Mixed precision solvers with half-precision floating point numbers for Lattice QCD on A64FX processor. Issaku Kanamori, Hideo Matsufuru, Tatsumi Aoyama, Kazuyuki Kanaya, Yusuke Namekawa, Hidekatsu Nemura, Keigo Nitadori. 145-155
- Prototyping an Autotuning Framework for Program Optimization Using Exo Language. Rin Iwai, Jens Domke, Emil Vatai, Yukinori Sato. 156-164
- Hybrid Inference Optimization for AI-Enhanced Turbulent Boundary Layer Simulation on Heterogeneous Systems. Fabian Orland, Tom Hilgers, Fabian Hübenthal, Rakesh Sarma, Andreas Lintermann, Christian Terboven. 165-176
- CityScaleCast: Spatiotemporal GNN for City-Scale Weather Prediction with GraphCast-Guided Parallel Modeling and Multi-Step Forecasting in Sendai. Xuanwen Pan, Yoichi Shimomura, Sichen Tao, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa. 177-185
- Explainable AI-Guided Genetic Algorithms for Efficient Software Automatic Tuning. Toshinobu Katayama, Masatoshi Kawai, Yoichi Shimomura, Keichi Takahashi, Hiroyuki Takizawa. 186-197
- MemMan: A Fortran and C++ Interoperable Memory Manager on Modern High-Performance Computing Platforms. Claudius Holeksa, Terry Cojean, Yen-Chen Chen, Sergey Kosukhin, Jörg Behrens, Luis Kornblueh, Claudia Frauen, Hartwig Anzt. 198-207
- Integrating Quantum and HPC: A Prototype Hybrid Implementation and Benchmark of Quantum-Selected Configuration Interaction. Ikko Hamamura, Toshiyuki Imamura, Takashi Arakawa, Shinji Sumimoto, Kazuya Yamazaki, Yao Hu, Shinnosuke Furuya, Kengo Nakajima. 208-218
- On modeling knowledge graphs for representing and explaining wide-area distributed storage system. Gabriele Morabito, Dante Sanzhez-Gallegos, Maria Fazio, Jesús Carretero. 219-225
- Efficient Data Elasticity for HPC: A Malleable Ad-hoc In-memory File System for Ephemeral Data. Genaro Sanchez-Gallegos, Jesús Carretero, Javier García Blas. 226-232
- Porting and Evaluation of Lustre on a RISC-V Cluster for HPC Storage Infrastructure. Surendra Billa, Rushikesh Jadhav, Yogeshwar Sonawane, Sanjay Wandhekar, Piyush Patle, Shaurya Rane. 233-238
- Migration of Ginkgo's Jacobi-Preconditioned CG Solver to Vector RISC-V. Patricia Siwinska, Héctor Martínez Pérez, Adrián Castelló. 239-246
- NIMA-STEP: A Hardware-Software Co-Design Approach for Accelerating Cellular Automata Computation. Nima Sahraneshinsamani, Sergio Barrachina-Mir. 247-254
- An analysis of memory access patterns in RISC-V vector workloads on heterogeneous memory architectures. Ryo Yokoyama, Masahito Kumagai, Kazuhiko Komatsu, Masayuki Sato, Hiroaki Kobayashi. 255-262
- Understanding LLM Checkpoint/Restore I/O Strategies and Patterns. Mikaila J. Gossman, Avinash Maurya, Bogdan Nicolae, Jon Calhoun. 263-273
- DeepEBC: Compressing the Pre-Trained LLMs with Error-Bounded Lossy Compression. Jiaqi Xu, Zhaorui Zhang, Gaolin Wei, Sheng Di, Benben Liu, Xiaodong Yu, Xiaoyi Lu. 274-283
- Performance and Area Optimization of Lossless Hardware Compression of Floating-Point Data Streams. Linyi Li, Jason Anderson, Tomohiro Ueno. 284-292
- Parallelising Stream-Based Lossless Data Compression on GPUs and CPUs. Yue Zhang, Yuki Hara, Oliver Sinnen, Shinichi Yamagiwa. 293-302
- DGEMM using FP64 Arithmetic Emulation and FP8 Tensor Cores with Ozaki Scheme. Daichi Mukunoki. 303-311
- Micro-Benchmarking Communications Libraries on the MI300A Compute Partitioning Modes. Simon Garcia De Gonzalo, Arha Gatram, Aadhav Saravanakumar. 312-322
- Q-IRIS: The Evolution of the IRIS Task-Based Runtime to Enable Classical-Quantum Workflows. Narasinga Rao Miniskar, Mohammad Alaul Haque Monil, Elaine Wong, Vicente Leyton-Ortega, Jeffrey S. Vetter, Seth R. Johnson, Travis S. Humble. 323-329
- Orchid: Towards Heterogeneous Batched Eigenvalue Solvers. Matthew Chung, Keita Teranishi, Narasinga Rao Miniskar, Toshiyuki Imamura, Mohammad Alaul Haque Monil. 330-338
- Large-Scale Vlasov Simulations for Astrophysics using Non-volatile Memory as Large Memory. Norihisa Fujita, Keita Ito, Kohji Yoshikawa, Kohei Hiraga, Osamu Tatebe, Akira Nukada, Taisuke Boku. 339-347
- A Unidirectional Two-Compartment Neuron Circuit with On-chip STDP learning. Ashish Gautam, Shunta Furuichi, Takashi Kohno. 348-352
- Evaluating Claude Code's Coding and Test Automation for GPU Acceleration of a Legacy Fortran Application: A GeoFEM Case Study. Tetsuya Hoshino, Shun-ichiro Hayashi, Daichi Mukunoki, Takahiro Katagiri, Toshihiro Hanawa. 353-360
- Runtime Prediction for Local Deployment of Large Language Models: A Case Study on Qwen Models Covering LoRA Fine-Tuning, RAG, and Inference. Jian Guo, Jianwen Wei, Yufei Cheng, Jiajie Sheng, Yijun Wu, Kento Sato. 361-369
- Semantic Equivalence Verification of HPC Codes Using LLMs. Yuta Tanizawa, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa. 370-377
- Fine-Tuned Sentence-BERT for HPC Job Outcome Prediction via Textual Feature Embedding. Thanh Hoang Le Hai, Huy Nguyen Tuan, Bao Tran Dang, Bao Vo Thuong, Nam Thoai. 378-387
- Sustainable Power Reduction for DRAM applying Adaptive Stream-based Entropy Coding focusing on Video Processing System. Taku Nishikawa, Koichi Marumo, Shinichi Yamagiwa. 388-398
- Exploring I/O Performance and Power Consumption Trade-offs in Production Environments. Zhaobin Zhu, Andreas Henkel, Sarah Neuwirth. 399-406
- A Co-Simulation Framework for Building Energy Management as a Testbed for Energy-Aware Data Movement Analysis. Chomphunuch Wongphong, Patchararat Wongta, Worawan Marurngsith. 407-417
- Quantifying the Energy Cost of Performance Inefficiency in HPC Applications. Radita Liem, Dlyaver Djebarov. 418-427
- Python-based Workflow System for Quantum-HPC Hybrid Application on HPC System and Quantum Computer with Shared Network. Soratouch Pornmaneerattanatri, Miwako Tsuji, Ketan Maheshwari, Mitsuhisa Sato. 428-432
- The LCLStream Ecosystem for Multi-Institutional Dataset Exploration. David M. Rogers, Valerio Mariani, Cong Wang, Ryan N. Coffee, Wilko Kroeger, Murali Shankar, Hans Thorsten Schwander, Thomas L. Beck, Frédéric Poitevin, Jana Thayer. 433-442
- Beyond Centralized Labs: Federating the Co-Scientist. Amal Gueroudji, Alok Kamatar, Sabrina Chaouche, Robert B. Ross, Kyle Chard, Ian T. Foster. 443-449
- xspline3d: A Python Library for MPI-Based Spline Interpolation Enforcing Global Continuity in Distributed 3D Volumes. Wenyang Zhao, Osamu Miyashita, Florence Tama. 450-456
- Performance and Programmability of MPI+X Integration with CUDA, HIP, SYCL, OpenACC, and OpenMP Offloading for Supercomputing: A Case Study on Dense Matrix-Vector Multiplication. Ezhilmathi Krishnasamy, James Throtter, Xing Cai, Dirk Pleiter, Leon Kos, Laura Saavedra, Pascal Bouvry. 457-468