- A High-Performance MST Implementation for GPUs. Alex Fallin, Andres Gonzalez, Jarim Seo, Martin Burtscher.
- SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving. Kaijie Fan, Marco D'Antonio, Lorenzo Carpentieri, Biagio Cosenza, Federico Ficarelli, Daniele Cesarini.
- FORGE: Pre-Training Open Foundation Models for Science. Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar.
- Big Data Assimilation: Real-time 30-second-refresh Heavy Rain Forecast Using Fugaku During Tokyo Olympics and Paralympics. Takemasa Miyoshi, Arata Amemiya, Shigenori Otsuka, Yasumitsu Maejima, James Taylor, Takumi Honda, Hirofumi Tomita, Seiya Nishizawa, Kenta Sueki, Tsuyoshi Yamaura, Yutaka Ishikawa, Shinsuke Satoh, Tomoo Ushio, Kana Koike, Atsuya Uno.
- Runtime Composition of Iterations for Fusing Loop-carried Sparse Dependence. Kazem Cheshmi, Michelle Strout, Maryam Mehri Dehnavi.
- Choosing the Best Parallelization and Implementation Styles for Graph Analytics Codes: Lessons Learned from 1106 Programs. Yiqian Liu, Noushin Azami, Avery Vanausdal, Martin Burtscher.
- MBFGraph: An SSD-based External Graph System for Evolving Graphs. Chun-Yi Liu, Wonil Choi, Soheil Khadirsharbiyani, Mahmut T. Kandemir.
- Rethinking Deployment for Serverless Functions: A Performance-First Perspective. Yiming Li, Laiping Zhao, Yanan Yang, Wenyu Qu.
- Frontier: Exploring Exascale. Scott Atchley, Christopher Zimmer, John Lange, David E. Bernholdt, Verónica G. Melesse Vergara, Thomas Beck, Michael J. Brim, Reuben D. Budiardja, Sunita Chandrasekaran, Markus Eisenbach, Thomas M. Evans, Matthew Ezell, Nicholas Frontiere, Antigoni Georgiadou, Joe Glenski, Philipp Grete, Steven P. Hamilton, John Holmen, Axel Huebl, Daniel A. Jacobson, Wayne Joubert, Kim Mcmahon, Elia Merzari, Stan G. Moore, Andrew Myers, Stephen Nichols, Sarp Oral, Thomas Papatheodore, Danny Perez, David M. Rogers, Evan Schneider, Jean-Luc Vay, P.-K. Yeung.
- Graph3PO: A Temporal Graph Data Processing Method for Latency QoS Guarantee in Object Cloud Storage System. Wang Zhang, Zhan Shi, Ziyi Liao, Yiling Li, Yu Du, Yutong Wu, Fang Wang, Dan Feng.
- ANT-MOC: Scalable Neutral Particle Transport Using 3D Method of Characteristics on Multi-GPU Systems. Shunde Li, Zongguo Wang, Lingkun Bu, Jue Wang, Zhikuang Xin, Shigang Li, Yangang Wang, Yangde Feng, Peng Shi, Yun Hu, Xuebin Chi.
- Cloud Computing to Enable Wearable-Driven Longitudinal Hemodynamic Maps. Cyrus Tanade, Emily Rakestraw, William Ladd, Erik W. Draeger, Amanda Randles.
- GreenNFV: Energy-Efficient Network Function Virtualization with Service Level Agreement Constraints. Md. S. Q. Zulkar Nine, Tevfik Kosar, Muhammed Fatih Bulut, Jinho Hwang.
- Efficient Maximal Biclique Enumeration on GPUs. Zhe Pan, Shuibing He, Xu Li, Xuechen Zhang, Rui Wang, Gang Chen.
- High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations. Maciej Besta, Pawel Renc, Robert Gerstenberger, Paolo Sylos Labini, Alexandros Nikolaos Ziogas, Tiancheng Chen, Lukas Gianinazzi, Florian Scheidl, Kalman Szenes, Armon Carigiet, Patrick Iff, Grzegorz Kwasniewski, Raghavendra Kanakagiri, Chio Ge, Sammy Jaeger, Jaroslaw Was, Flavio Vella, Torsten Hoefler.
- FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics. Chunshu Wu, Tong Geng, Anqi Guo, Sahan Bandara, Pouya Haghi, Chuan Liu, Ang Li, Martin C. Herbordt.
- VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores. Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler.
- Understanding the Effects of Permanent Faults in GPU's Parallelism Management and Control Units. Juan-David Guerrero-Balaguera, Josie Esteban Rodriguez Condia, Fernando Fernandes dos Santos, Matteo Sonza Reorda, Paolo Rech.
- Bringing Order to Sparsity: A Sparse Matrix Reordering Study on Multicore CPUs. James D. Trotter, Sinan Ekmekçibasi, Johannes Langguth, Tugba Torun, Emre Düzakin, Aleksandar Ilic, Didem Unat.
- Adaptive Workload-Balanced Scheduling Strategy for Global Ocean Data Assimilation on Massive GPUs. Junmin Xiao, Chaoyang Shui, Di Cai, Kangyu Wang, Yunfei Pang, Mingyi Li, Hui Ma, Guangming Tan.
- A Quantitative Approach for Adopting Disaggregated Memory in HPC Systems. Jacob Wahlgren, Gabin Schieffer, Maya B. Gokhale, Ivy Peng.
- High-Performance SVD Partial Spectrum Computation. David E. Keyes, Hatem Ltaief, Yuji Nakatsukasa, Dalal Sukkari.
- ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers. Linghao Song, Fan Chen, Hai Li, Yiran Chen.
- Fine-grained Policy-driven I/O Sharing for Burst Buffers. Ed Karrels, Lei Huang, Yuhong Kan, Ishank Arora, Yinzhi Wang, Daniel S. Katz, William Gropp, Zhao Zhang.
- DPS: Adaptive Power Management for Overprovisioned Systems. Jianru Ding, Henry Hoffmann.
- ADT-FSE: A New Encoder for SZ. Tao Lu, Yu Zhong, Zibin Sun, Xiang Chen, You Zhou, Fei Wu, Ying Yang, Yunxin Huang, Yafei Yang.
- Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach. Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu.
- Unity ECC: Unified Memory Protection Against Bit and Chip Errors. Dongwhee Kim, Jaeyoon Lee, Wonyeong Jung, Michael B. Sullivan, Jungrae Kim.
- Leveraging the Compute Power of Two HPC Systems for Higher-Dimensional Grid-Based Simulations with the Widely-Distributed Sparse Grid Combination Technique. Theresa Pollinger, Alexander Van Craen, Christoph Niethammer, Marcel Breyer, Dirk Pflüger.
- Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers. Meng Wang, Jiajun Mao, Rajdeep Rana, John Bent, Serkay Olmez, Anjus George, Garrett Wilson Ransom, Jun Li, Haryadi S. Gunawi.
- High Throughput Training of Deep Surrogates from Large Ensemble Runs. Lucas Thibaut Meyer, Marc Schouler, Robert Alexander Caulk, Alejandro Ribés, Bruno Raffin.
- HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU. Zane Fink, Konstantinos Parasyris, Giorgis Georgakoudis, Harshitha Menon.
- 5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer. Rongfen Lin, Xinhui Yuan, Wei Xue, Wanwang Yin, Jienan Yao, Junda Shi, Qiang Sun, Chaobo Song, Fei Wang.
- Co-design Hardware and Algorithm for Vector Search. Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso.
- cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance. Yafan Huang, Sheng Di, Xiaodong Yu, Guanpeng Li, Franck Cappello.
- Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning. Qiyang Ding, Pengfei Zheng, Shreyas Kudari, Shivaram Venkataraman, Zhao Zhang.
- Breaking Boundaries: Distributed Domain Decomposition with Scalable Physics-Informed Neural PDE Solvers. Arthur Feeney, Zitong Li, Ramin Bostanabad, Aparna Chandramowlishwaran.
- Scaling the "Memory Wall" for Multi-Dimensional Seismic Processing with Algebraic Compression on Cerebras CS-2 Systems. Hatem Ltaief, Yuxi Hong, Leighton Wilson, Mathias Jacquelin, Matteo Ravasi, David Elliot Keyes.
- Experimental Evaluation of Xanadu X8 Photonic Quantum Computer: Error Measurement, Characterization and Implications. Aditya Ranjan, Tirthak Patel, Harshitta Gandhi, Daniel Silver, William Cutler, Devesh Tiwari.
- Enhancing Adaptive Physics Refinement Simulations Through the Addition of Realistic Red Blood Cell Counts. Sayan Roychowdhury, Samreen T. Mahmud, Aristotle X. Martin, Peter Balogh, Daniel F. Puleri, John Gounley, Erik W. Draeger, Amanda Randles.
- Toward Exascale Computation for Turbomachinery Flows. Yuhang Fu, Weiqi Shen, Jiahuan Cui, Yao Zheng, Guangwen Yang, Zhao Liu, Jifa Zhang, Tingwei Ji, Fangfang Xie, Xiaojing Lv, Hanyue Liu, Xu Liu, Xiyang Liu, Xiaoyu Song, Guocheng Tao, Yan Yan, Paul Tucker, Steven A. E. Miller, Shirui Luo, Seid Koric, Weimin Zheng.
- Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay. Konstantinos Parasyris, Giorgis Georgakoudis, Esteban Rangel, Ignacio Laguna, Johannes Doerfert.
- FISCO-BCOS: An Enterprise-grade Permissioned Blockchain System with High-performance. Huizhong Li, Yujie Chen, Xiang Shi, Xingqiang Bai, Nan Mo, Wenlin Li, Rui Guo, Zhang Wang, Yi Sun.
- PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems. Xu Fu, Bingbin Zhang, Tengcheng Wang, Wenhao Li, Yuechen Lu, Enxin Yi, Jianqi Zhao, Xiaohan Geng, Fangying Li, Jingwen Zhang, Zhou Jin, Weifeng Liu.
- DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication. Yuechen Lu, Weifeng Liu.
- I/O in WRF: A Case Study in Modern Parallel I/O Techniques. Zanhua Huang, Kaiyuan Hou, Ankit Agrawal, Alok N. Choudhary, Robert B. Ross, Wei-keng Liao.
- Scaling the Leading Accuracy of Deep Equivariant Models to Biomolecular Simulations of Realistic Size. Boris Kozinsky, Albert Musaelian, Anders Johansson, Simon L. Batzner.
- Exploring the Ultimate Regime of Turbulent Rayleigh-Bénard Convection Through Unprecedented Spectral-Element Simulations. Niclas Jansson, Martin Karp, Adalberto Perez, Timofey Mukha, Yi Ju, Jiahui Liu, Szilárd Páll, Erwin Laure, Tino Weinkauf, Jörg Schumacher, Philipp Schlatter, Stefano Markidis.
- Automated Mapping of Task-Based Programs onto Distributed and Heterogeneous Machines. Thiago S. F. X. Teixeira, Alexandra Henzinger, Rohan Yadav, Alex Aiken.
- Structural Coding: A Low-Cost Scheme to Protect CNNs from Large-Granularity Memory Faults. Ali Asgari Khoshouyeh, Florian Geissler, Syed Qutub, Michael Paulitsch, Prashant Nair, Karthik Pattabiraman.
- Calculon: a methodology and tool for high-level co-design of systems and large language models. Mikhail Isaev, Nic McDonald, Larry Dennison, Richard W. Vuduc.
- The Simple Cloud-Resolving E3SM Atmosphere Model Running on the Frontier Exascale System. Mark Taylor, Peter M. Caldwell, Luca Bertagna, Conrad Clevenger, Aaron Donahue, James G. Foucar, Oksana Guba, Benjamin R. Hillman, Noel Keen, Jayesh Krishna, Matthew R. Norman, Sarat Sreepathi, Christopher Terai, James B. White III, Andrew G. Salinger, Renata B. McCoy, Lai-yung Ruby Leung, David C. Bader, Danqing Wu.
- NNQS-Transformer: an Efficient and Scalable Neural Network Quantum States Approach for Ab initio Quantum Chemistry. Yangjun Wu, Chu Guo, Yi Fan, Pengyu Zhou, Honghui Shang.
- Portable and Scalable All-Electron Quantum Perturbation Simulations on Exascale Supercomputers. Zhikun Wu, Yangjun Wu, Ying Liu, Honghui Shang, Yingxiang Gao, Zhongcheng Zhang, Yuyang Zhang, Yingchi Long, Xiaobing Feng, Huimin Cui.
- Large-Scale Materials Modeling at Quantum Accuracy: Ab Initio Simulations of Quasicrystals and Interacting Extended Defects in Metallic Alloys. Sambit Das, Bikash Kanungo, Vishal Subramanian, Gourab Panigrahi, Phani Motamarri, David M. Rogers, Paul Zimmerman, Vikram Gavini.
- Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics. Shenghong Huang, Junshi Chen, Ziyu Zhang, Xiaoyu Hao, Jun Gu, Hong An, Chun Zhao, Yan Hu, Zhanming Wang, Longkui Chen, Yifan Luo, Jineng Yao, Yi Zhang, Yang Zhao, ZhiHao Wang, Dongning Jia, Zhao Jin, Changming Song, Xisheng Luo, Xiaobin He, Dexun Chen.
- Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency. Ziming Liu, Shenggan Cheng, Haotian Zhou, Yang You.
- Optimizing Reconfigurable Optical Datacenters: The Power of Randomization. Marcin Bienkowski, David Fuchssteiner, Stefan Schmid.
- Experiences readying applications for Exascale. Nicholas Malaya, Bronson Messer, Joseph Glenski, Antigoni Georgiadou, Justin Lietz, Kalyana C. Gottiparthi, Marc Day, Jackie Chen, Jon S. Rood, Lucas Esclapez, James B. White III, Gustav R. Jansen, Nicholas Curtis, Stephen Nichols, Jakub Kurzak, Noel Chalmers, Chip Freitag, Paul T. Bauman, Alessandro Fanfarillo, Reuben D. Budiardja, Thomas Papatheodore, Nicholas Frontiere, Damon McDougall, Matthew R. Norman, Sarat Sreepathi, Philip C. Roth, Dmytro Bykov, Noah Wolfe, Paul Mullowney, Markus Eisenbach, Marc T. Henry de Frahan, Wayne Joubert.
- 69.7-PFlops Extreme Scale Earthquake Simulation with Crossing Multi-faults and Topography on Sunway. Wubing Wan, Lin Gan, Wenqiang Wang, Zekun Yin, Haodong Tian, Zhenguo Zhang, Yinuo Wang, Mengyuan Hua, Xiaohui Liu, Shengye Xiang, Zhongqiu He, Zijia Wang, Ping Gao, Xiaohui Duan, Weiguo Liu, Wei Xue, Haohuan Fu, Guangwen Yang, Xiaofei Chen, Zeyu Song, Yaojian Chen, Xin Liu, Wei Zhang.
- Itoyori: Reconciling Global Address Space and Global Fork-Join Task Parallelism. Shumpei Shiina, Kenjiro Taura.
- Optimizing MPI Collectives on Shared Memory Multi-Cores. Jintao Peng, Jianbin Fang, Jie Liu, Min Xie, Yi Dai, Bo Yang, Shengguo Li, Zheng Wang.
- BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training. Kaihua Fu, Quan Chen, Yuzhuo Yang, Jiuchen Shi, Chao Li, Minyi Guo.
- Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters. Yang Liu, Nan Ding, Piyush Sao, Samuel Williams, Xiaoye Sherry Li.
- Accelerating Communications in Federated Applications with Transparent Object Proxies. J. Gregory Pauloski, Valérie Hayot-Sasson, Logan T. Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, Ian T. Foster.
- Enabling Real World Scale Structural Superlubricity All-Atom Simulation on the Next-Generation Sunway Supercomputer. Xiaohui Duan, Jin Wang, Ping Gao, Ming Ma, Lin Gan, Xin Liu, Haohuan Fu, Wei Xue, Dexun Chen, Guangwen Yang, Weiguo Liu.
- FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs. Philipp Schaad, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Alexandros Nikolaos Ziogas, Torsten Hoefler.
- Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service. Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari.
- Enhance the Strong Scaling of LAMMPS on Fugaku. Jianxiong Li, Tong Zhao, Zhuoqiang Guo, Shunchen Shi, Lijun Liu, Guangming Tan, Weile Jia, Guojun Yuan, Zhan Wang.
- Optimizing Direct Convolutions on ARM Multi-Cores. Pengyu Wang, Weiling Yang, Jianbin Fang, Dezun Dong, Chun Huang, Peng Zhang, Tao Tang, Zheng Wang.
- EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs. Mingzhen Li, Wencong Xiao, Hailong Yang, Biao Sun, Hanyu Zhao, Shiru Ren, Zhongzhi Luan, Xianyan Jia, Yi Liu, Yong Li, Wei Lin, Depei Qian.
- Toward Sustainable HPC: Carbon Footprint Estimation and Environmental Implications of HPC Systems. Baolin Li, Rohan Basu Roy, Daniel Wang, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari.
- Application Performance Modeling via Tensor Completion. Edward Hutter, Edgar Solomonik.
- Automatic Generation of Distributed-Memory Mappings for Tensor Computations. Martin Kong, Raneem Abu Yosef, Atanas Rountev, P. Sadayappan.
- Demystifying and Mitigating Cross-Layer Deficiencies of Soft Error Protection in Instruction Duplication. Zhengyang He, Yafan Huang, Hui Xu, Dingwen Tao, Guanpeng Li.
- HEAR: Homomorphically Encrypted Allreduce. Marcin Chrapek, Mikhail Khalilov, Torsten Hoefler.
- Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU. Luk Burchard, Max Xiaohang Zhao, Johannes Langguth, Aydin Buluç, Giulia Guidi.
- DGAP: Efficient Dynamic Graph Analysis on Persistent Memory. Abdullah Al Raqibul Islam, Dong Dai.
- Legate Sparse: Distributed Sparse Computing in Python. Rohan Yadav, Wonchan Lee, Melih Elibol, Manolis Papadakis, Taylor Lee Patti, Michael Garland, Alex Aiken, Fredrik Kjolstad, Michael Bauer.
- Embracing Irregular Parallelism in HPC with YGM. Trevor Steil, Tahsin Reza, Benjamin Priest, Roger Pearce.
- GraphSet: High Performance Graph Mining through Equivalent Set Transformations. Tianhui Shi, Jidong Zhai, Haojie Wang, Qiqian Chen, Mingshu Zhai, Zixu Hao, Haoyu Yang, Wenguang Chen.
- Rapid simulations of atmospheric data assimilation of hourly-scale phenomena with modern neural networks. Yiyuan Li, Xiting Ju, Yi Xiao, Qilong Jia, Yongxiao Zhou, Simeng Qian, Rongfen Lin, Bin Yang, Shupeng Shi, Xin Liu, Jie Gao, Zhen Wang, Sha Liu, Jian Tan, Xuan Wang, Zhengding Hu, Limin Yan, Wei Xue.
- Large-Scale Simulation of Structural Dynamics Computing on GPU Clusters. Yumeng Shi, Ningming Nie, Jue Wang, Kehao Lin, Chunbao Zhou, Shigang Li, Kehan Yao, Shunde Li, Yangde Feng, Yan Zeng, Fang Liu, Yangang Wang, Yue Gao.
- Data Flow Lifecycles for Optimizing Workflow Coordination. Hyungro Lee, Luanzheng Guo, Meng Tang, Jesun Firoz, Nathan Tallent, Anthony Kougkas, Xian-He Sun.
- Mitigating Coupling Map Constrained Correlated Measurement Errors on Quantum Devices. Alan Robertson, Shuaiwen Song.
- PeeK: A Prune-Centric Approach for K Shortest Path Computation. Wang Feng, Shiyang Chen, Hang Liu, Yuede Ji.
- Xfast: Extreme File Attribute Stat Acceleration for Lustre. Yingjin Qian, Wen Cheng, Lingfang Zeng, Xi Li, Marc-André Vef, Andreas Dilger, Siyao Lai, Shuichi Ihara, Yong Fan, André Brinkmann.
- The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores. Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Nils Blach, Berke Egeli, George Mitenkov, Wojciech Chlapek, Marek T. Michalewicz, Hubert Niewiadomski, Jürgen Müller, Torsten Hoefler.
- A GPU Algorithm for Detecting Strongly Connected Components. Ghadeer Alabandi, William Sands, George Biros, Martin Burtscher.
- TANGO: re-thinking quantization for graph neural network training on GPUs. Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu.
- Phases, Modalities, Spatial and Temporal Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics. Pengmiao Zhang, Rajgopal Kannan, Viktor K. Prasanna.
- DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training. Hongkuan Zhou, Da Zheng, Xiang Song, George Karypis, Viktor K. Prasanna.
- GRAPHINE: Enhanced Neutral Atom Quantum Computing using Application-Specific Rydberg Atom Arrangement. Tirthak Patel, Daniel Silver, Devesh Tiwari.
- Exascale Multiphysics Nuclear Reactor Simulations for Advanced Designs. Elia Merzari, Steven P. Hamilton, Thomas M. Evans, Misun Min, Paul F. Fischer, Stefan Kerkemeier, Jun Fang, Paul Romano, Yu-Hsiang Lan, Malachi Phillips, Elliott Biondo, Katherine Royston, Tim Warburton, Noel Chalmers, Thilina Rathnayake.
- TrivialSpy: Identifying Software Triviality via Fine-grained and Dataflow-based Value Profiling. Xin You, Hailong Yang, Kelun Lei, Zhongzhi Luan, Depei Qian.
- Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods. Jingrong Zhang, Akira Naruse, Xipeng Li, Yong Wang.
- Prodigy: Towards Unsupervised Anomaly Detection in Production HPC Systems. Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun.
- Optimizing High-Performance Linpack for Exascale Accelerated Architectures. Noel Chalmers, Jakub Kurzak, Damon McDougall, Paul T. Bauman.
- AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications. Daoce Wang, Jesus Pulido, Pascal Grosset, Jiannan Tian, Sian Jin, Houjun Tang, Jean M. Sexton, Sheng Di, Kai Zhao, Bo Fang, Zarija Lukic, Franck Cappello, James P. Ahrens, Dingwen Tao.