- Achieving Sub-second Pairwise Query over Evolving Graphs. Hongtao Chen, Mingxing Zhang, Ke Yang, Kang Chen, Albert Y. Zomaya, Yongwei Wu, Xuehai Qian. 1-15
- AfterImage: Leaking Control Flow Data and Tracking Load Operations via the Hardware Prefetcher. Yun Chen, Lingfeng Pei, Trevor E. Carlson. 16-32
- A Generic Service to Provide In-Network Aggregation for Key-Value Streams. Yongchao He, Wenfei Wu, Yanfang Le, Ming Liu, ChonLam Lao. 33-47
- A Prediction System Service. Zhizhou Zhang 0002, Alvin Oliver Glova, Timothy Sherwood, Jonathan Balkind. 48-60
- AtoMig: Automatically Migrating Millions Lines of Code from TSO to WMM. Martin Beck, Koustubha Bhat, Lazar Stricevic, Geng Chen, Diogo Behrens, Ming Fu, Viktor Vafeiadis, Haibo Chen 0001, Hermann Härtig. 61-73
- BeeHive: Sub-second Elasticity for Web Services with Semi-FaaS Execution. Ziming Zhao 0003, Mingyu Wu 0001, Jiawei Tang, Binyu Zang, Zhaoguo Wang, Haibo Chen 0001. 74-87
- Better Than Worst-Case Decoding for Quantum Error Correction. Gokul Subramanian Ravi, Jonathan M. Baker, Arash Fayyazi, Sophia Fuhui Lin, Ali Javadi-Abhari, Massoud Pedram, Frederic T. Chong. 88-102
- Betty: Enabling Large-Scale GNN Training with Batch-Level Graph Partitioning. Shuangyan Yang, Minjia Zhang, Wenqian Dong, Dong Li 0001. 103-117
- Carbon Explorer: A Holistic Framework for Designing Carbon Aware Datacenters. Bilge Acun, Benjamin Lee, Fiodar Kazhamiaka, Kiwan Maeng, Udit Gupta, Manoj Chakkaravarthy, David Brooks 0001, Carole-Jean Wu. 118-132
- CommonGraph: Graph Analytics on Evolving Data. Mahbod Afarin, Chao Gao, Shafiur Rahman, Nael B. Abu-Ghazaleh, Rajiv Gupta 0001. 133-145
- Compilation Consistency Modulo Debug Information. Theodore Luo Wang, Yongqiang Tian 0001, Yiwen Dong 0002, ZhenYang Xu, Chengnian Sun. 146-158
- Compiling Distributed System Models with PGo. Finn Hackett, Shayan Hosseini, Renato Costa, Matthew Do, Ivan Beschastnikh. 159-175
- Copy-on-Pin: The Missing Piece for Correct Copy-on-Write. David Hildenbrand, Martin Schulz 0001, Nadav Amit. 176-191
- Decker: Attack Surface Reduction via On-Demand Code Mapping. Chris Porter, Sharjeel Khan, Santosh Pande. 192-206
- DeepUM: Tensor Migration and Prefetching in Unified Memory. Jaehoon Jung, Jinpyo Kim, Jaejin Lee. 207-221
- Ditto: End-to-End Application Cloning for Networked Cloud Services. Mingyu Liang, Yu Gan 0002, Yueying Li, Carlos Torres, Abhishek Dhanotia, Mahesh Ketkar, Christina Delimitrou. 222-236
- DPACS: Hardware Accelerated Dynamic Neural Network Pruning through Algorithm-Architecture Co-design. Yizhao Gao, Baoheng Zhang, Xiaojuan Qi, Hayden Kwok-Hay So. 237-251
- Ecovisor: A Virtual Energy System for Carbon-Efficient Applications. Abel Souza, Noman Bashir, Jorge Murillo, Walid A. Hanafy, Qianlin Liang, David E. Irwin 0001, Prashant J. Shenoy. 252-265
- ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning. Diandian Gu, Yihao Zhao, Yinmin Zhong, Yifan Xiong, Zhenhua Han, Peng Cheng, Fan Yang, Gang Huang, Xin Jin, Xuanzhe Liu. 266-280
- EVStore: Storage and Caching Capabilities for Scaling Embedding Tables in Deep Recommendation Systems. Daniar Heri Kurniawan, Ruipu Wang, Kahfi S. Zulkifli, Fandi A. Wiranata, John Bent, Ymir Vigfusson, Haryadi S. Gunawi. 281-294
- FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks. Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna. 295-310
- FrozenQubits: Boosting Fidelity of QAOA by Skipping Hotspot Nodes. Ramin Ayanzadeh, Narges Alavisamani, Poulami Das 0005, Moinuddin K. Qureshi. 311-324
- GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture. Zaid Qureshi, Vikram Sharma Mailthody, Isaac Gelado, Seungwon Min, Amna Masood, Jeongmin Park, Jinjun Xiong, C. J. Newburn, Dmitri Vainbrand, I-Hsin Chung, Michael Garland, William J. Dally, Wen-mei W. Hwu. 325-339
- GZKP: A GPU Accelerated Zero-Knowledge Proof System. Weiliang Ma, Qian Xiong, Xuanhua Shi, Xiaosong Ma, Hai Jin 0001, Haozhao Kuang, Mingyu Gao 0001, Ye Zhang, Haichen Shen, Weifang Hu. 340-353
- Hacky Racers: Exploiting Instruction-Level Parallelism to Generate Stealthy Fine-Grained Timers. Haocheng Xiao, Sam Ainsworth 0001. 354-369
- Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs. Yaoyao Ding, Cody Hao Yu, Bojian Zheng, Yizhi Liu, Yida Wang, Gennady Pekhimenko. 370-384
- HuffDuff: Stealing Pruned DNNs from Sparse Accelerators. Dingqing Yang, Prashant J. Nair, Mieszko Lis. 385-399
- Junkyard Computing: Repurposing Discarded Smartphones to Minimize Carbon. Jennifer Switzer, Gabriel Marcano, Ryan Kastner, Pat Pannuto. 400-412
- Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine. Jingji Chen, Xuehai Qian. 413-426
- KIT: Testing OS-Level Virtualization for Functional Interference Bugs. Congyu Liu, Sishuai Gong, Pedro Fonseca 0001. 427-441
- LeaFTL: A Learning-Based Flash Translation Layer for Solid-State Drives. Jinghan Sun, Shaobo Li, YunXin Sun, Chao Sun, Dejan Vucinic, Jian Huang 0006. 442-456
- Lucid: A Non-intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs. Qinghao Hu, Meng Zhang, Peng Sun 0006, Yonggang Wen 0001, Tianwei Zhang 0004. 457-472
- MC Mutants: Evaluating and Improving Testing for Memory Consistency Specifications. Reese Levine, Tianhao Guo, Mingun Cho, Alan Baker, Raph Levien, David Neto, Andrew Quinn 0001, Tyler Sorensen 0002. 473-488
- Mobius: Fine Tuning Large-Scale Models on Commodity GPU Servers. Yangyang Feng, Minhui Xie, Zijie Tian, Shuo Wang, Youyou Lu, Jiwu Shu. 489-501
- MSCCLang: Microsoft Collective Communication Language. Meghan Cowan, Saeed Maleki, Madanlal Musuvathi, Olli Saarikivi, Yifan Xiong. 502-514
- Navigating the Dynamic Noise Landscape of Variational Quantum Algorithms with QISMET. Gokul Subramanian Ravi, Kaitlin N. Smith, Jonathan M. Baker, Tejas Kannan, Nathan Earnest, Ali Javadi-Abhari, Henry Hoffmann, Frederic T. Chong. 515-529
- NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers. Jiawei Liu, Jinkun Lin, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, Lingming Zhang. 530-543
- NUBA: Non-Uniform Bandwidth GPUs. Xia Zhao 0004, Magnus Jahre, Yuhua Tang, Guangda Zhang, Lieven Eeckhout. 544-559
- Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression. Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, Hyung Jin Kim, Youngsok Kim, Jinho Lee. 560-573
- Pond: CXL-Based Memory Pooling Systems for Cloud Platforms. Huaicheng Li, Daniel S. Berger, Lisa Hsu, Daniel Ernst, Pantea Zardoshti, Stanko Novakovic, Monish Shah, Samir Rajadnya, Scott Lee, Ishwar Agarwal, Mark D. Hill, Marcus Fontoura, Ricardo Bianchini. 574-587
- Prism: Optimizing Key-Value Store for Modern Heterogeneous Storage Devices. Yongju Song, Wook-Hee Kim, Sumit Kumar Monga, Changwoo Min, Young Ik Eom. 588-602
- Probabilistic Concurrency Testing for Weak Memory Programs. Mingyu Gao 0006, Soham Chakraborty 0001, Burcu Kulahcioglu Ozkan. 603-616
- Propeller: A Profile Guided, Relinking Optimizer for Warehouse-Scale Applications. Han Shen, Krzysztof Pszeniczny, Rahman Lavaee, Snehasish Kumar, Sriraman Tallam, Xinliang David Li. 617-631
- Protecting Data Integrity of Web Applications with Database Constraints Inferred from Application Code. Haochen Huang, Bingyu Shen 0002, Li Zhong, Yuanyuan Zhou 0001. 632-645
- Qompress: Efficient Compilation for Ququarts Exploiting Partial and Mixed Radix Operations for Communication Reduction. Andrew Litteken, Lennart Maximilian Seifert, Jason Chadwick, Natalia Nottingham, Frederic T. Chong, Jonathan M. Baker. 646-659
- RAIZN: Redundant Array of Independent Zoned Namespaces. Thomas Kim, Jekyeom Jeon, Nikhil Arora, Huaicheng Li, Michael Kaminsky, David G. Andersen, Gregory R. Ganger, George Amvrosiadis, Matias Bjørling. 660-673
- Revisiting Log-Structured Merging for KV Stores in Hybrid Memory Systems. Zhuohui Duan, Jiabo Yao, Haikun Liu, Xiaofei Liao, Hai Jin 0001, Yu Zhang 0027. 674-687
- Scoped Buffered Persistency Model for GPUs. Shweta Pandey, Aditya K. Kamath, Arkaprava Basu. 688-701
- ShakeFlow: Functional Hardware Description with Latency-Insensitive Interface Combinators. Sungsoo Han, Minseong Jang, Jeehoon Kang. 702-717
- Sigma: Compiling Einstein Summations to Locality-Aware Dataflow. Tian Zhao 0001, Alexander Rucker, Kunle Olukotun. 718-732
- SMAPPIC: Scalable Multi-FPGA Architecture Prototype Platform in the Cloud. Grigory Chirkov, David Wentzlaff. 733-746
- Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow. Zhiyao Li, Jiaxiang Li, Taijie Chen, Dimin Niu, Hongzhong Zheng, Yuan Xie, Mingyu Gao 0001. 747-761
- SpecPMT: Speculative Logging for Resolving Crash Consistency Overhead of Persistent Memory. Chencheng Ye, Yuanchao Xu 0001, Xipeng Shen, Yan Sha, Xiaofei Liao, Hai Jin 0001, Yan Solihin. 762-777
- Stepwise Debugging for Hardware Accelerators. Griffin Berlstein, Rachit Nigam, Christophe Gyurgyik, Adrian Sampson. 778-790
- STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining. Liwei Guo, Wonkyo Choe, Felix Xiaozhu Lin. 791-803
- TensorIR: An Abstraction for Automatic Tensorized Program Optimization. Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu 0001, TianQi Chen. 804-817
- TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization. Anand Jayarajan, Wei Zhao, Yudi Sun, Gennady Pekhimenko. 818-832
- TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning. Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang. 833-845
- Towards a Machine Learning-Assisted Kernel with LAKE. Henrique Fingler, Isha Tarte, Hangchen Yu, Ariel Szekely, Bodun Hu, Aditya Akella, Christopher J. Rossbach. 846-861
- uBFT: Microsecond-Scale BFT using Disaggregated Memory. Marcos K. Aguilera, Naama Ben-David, Rachid Guerraoui, Antoine Murat, Athanasios Xygkis, Igor Zablotchi. 862-877
- uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks. Yangjie Zhou 0001, Jingwen Leng, Yaoxu Song, Shuwen Lu, Mian Wang, Chao Li, Minyi Guo, Wenting Shen, Yong Li, Wei Lin, Xiangwen Liu, Hanqing Wu. 878-891
- VClinic: A Portable and Efficient Framework for Fine-Grained Value Profilers. Xin You, Hailong Yang, Kelun Lei, Zhongzhi Luan, Depei Qian. 892-904
- VDom: Fast and Unlimited Virtual Domains on Multiple Architectures. Ziqi Yuan, Siyu Hong, Rui Chang, Yajin Zhou, Wenbo Shen, Kui Ren 0001. 905-919
- WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program. Jaeyeon Won, Charith Mendis, Joel S. Emer, Saman P. Amarasinghe. 920-934
- Where Did My Variable Go? Poking Holes in Incomplete Debug Information. Cristian Assaiante, Daniele Cono D'Elia, Giuseppe Antonio Di Luna, Leonardo Querzoni. 935-947