- Basilisk: Using Provenance Invariants to Automate Proofs of Undecidable Protocols. Tony Nuda Zhang, Keshav Singh, Tej Chajed, Manos Kapritsos, Bryan Parno. 1-17
- Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems. Chang Lou, Dimas Shidqi Parikesit, Yujin Huang, Zhewen Yang, Senapati Diwangkara, Yuzhuo Jing, Achmad Imam Kistijantoro, Ding Yuan 0004, Suman Nath, Peng Huang 0005. 19-38
- Mirage: A Multi-Level Superoptimizer for Tensor Programs. Mengdi Wu, Xinhao Cheng, Shengyu Liu, Chunan Shi, Jianan Ji, Man Kit Ao, Praveen Velliengiri, Xupeng Miao, Oded Padon, Zhihao Jia. 21-38
- Picsou: Enabling Replicated State Machines to Communicate Efficiently. Reginald Frank, Micah Murray, Chawinphat Tankuranand, Junseo Yoo, Ethan Xu, Natacha Crooks, Suyash Gupta 0001, Manos Kapritsos. 39-56
- FineMem: Breaking the Allocation Overhead vs. Memory Waste Dilemma in Fine-Grained Disaggregated Memory Management. Xiaoyang Wang, Yongkun Li 0001, Kan Wu, Wenzhe Zhu, Yuqi Li, Yinlong Xu. 57-74
- To PRI or Not To PRI, That's the question. Yun Wang, Liang Chen, Jie Ji, Xianting Tian, Ben Luo, Zhixiang Wei, Zhibai Huang, Kailiang Xu, Kaihuan Peng, Kaijie Guo, Ning Luo, Guangjian Wang, Shengdong Dai, Yibin Shen, Jiesheng Wu, Zhengwei Qi. 75-89
- Enabling Efficient GPU Communication over Multiple NICs with FuseLink. Zhenghang Ren, Yuxuan Li, Zilong Wang 0007, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Xudong Liao, Yijun Sun, Bowen Liu, Han Tian, Junxue Zhang 0001, Mingfei Wang, Zhizhen Zhong, Guyue Liu, Ying Zhang 0022, Kai Chen 0005. 91-108
- Tigon: A Distributed Database for a CXL Pod. Yibo Huang 0006, Haowei Chen, Newton Ni, Yan Sun, Vijay Chidambaram, Dixin Tang, Emmett Witchel. 109-128
- Mako: Speculative Distributed Transactions with Geo-Replication. Weihai Shen, Yang Cui, Siddhartha Sen 0001, Sebastian Angel, Shuai Mu 0001. 129-152
- Quake: Adaptive Indexing for Vector Search. Jason Mohoney, Devesh Sarda, Mengze Tang, Shihabur Rahman Chowdhury, Anil Pacaci, Ihab F. Ilyas, Theodoros Rekatsinas, Shivaram Venkataraman. 153-169
- Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD. Hao Guo, Youyou Lu. 171-186
- Skybridge: Bounded Staleness for Distributed Caches. Robert Lyerly, Scott Pruett, Kevin Doherty, Greg Rogers, Nathan Bronson, John Hugg. 187-204
- Fork in the Road: Reflections and Optimizations for Cold Start Latency in Production Serverless Systems. Xiaohu Chai, Tianyu Zhou, Keyang Hu, Jianfeng Tan, Tiwei Bie, Anqi Shen, Dawei Shen, Qi Xing, Shun Song, Tongkai Yang, Le Gao, Feng Yu, Zhengyu He, Dong Du 0003, Yubin Xia, Kang Chen, Yu Chen. 199-218
- KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads. Yue Guan, Yuanwei Fang, Keren Zhou 0001, Corbin Robeck, Manman Ren, Zhongkai Yu, Yufei Ding 0001, Adnan Aziz. 205-220
- QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach. Shouyang Dong, Jun Bi, Di Huang, Jiaming Guo, Jianxing Xu, Ruibai Xu, Xinkai Song, Yifan Hao 0001, Ling Li 0001, Xuehai Zhou, Tianshi Chen 0002, Qi Guo 0001, Yunji Chen. 239-255
- WaferLLM: Large Language Model Inference at Wafer Scale. Congjie He, Yeqi Huang, Pei Mu 0003, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang 0024, Luo Mai. 257-273
- BlitzScale: Fast and Live Large Model Autoscaling with O(1) Host Caching. Dingyan Zhang, Haotian Wang, Yang Liu, Xingda Wei, Yizhou Shan, Rong Chen 0001, Haibo Chen 0001. 275-293
- Bayesian Code Diffusion for Efficient Automatic Deep Learning Program Optimization. Isu Jeong, Seulki Lee 0002. 295-311
- Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks. Yuxuan Jiang, Ziming Zhou, Boyu Xu, Beijie Liu, Runhui Xu, Peng Huang. 313-329
- Neutrino: Fine-grained GPU Kernel Profiling via Programmable Probing. Songlin Huang, Chenshu Wu. 331-355
- Principles and Methodologies for Serial Performance Optimization. Sujin Park, Mingyu Guan, Xiang Cheng, Taesoo Kim. 357-373
- Söze: One Network Telemetry Is All You Need for Per-flow Weighted Bandwidth Allocation at Scale. Weitao Wang, T. S. Eugene Ng. 375-392
- Decouple and Decompose: Scaling Resource Allocation with DeDe. Zhiying Xu, Minlan Yu, Francis Y. Yan. 393-409
- Quantum Virtual Machines. Runzhou Tao, Hongzheng Zhu, Jason Nieh, Jianan Yao, Ronghui Gu. 411-428
- QOS: Quantum Operating System. Emmanouil Giortamis, Francisco Romão, Nathaniel Tornow, Pramod Bhatotia. 429-447
- Scalio: Scaling up DPU-based JBOF Key-value Store with NVMe-oF Target Offload. Xun Sun, Mingxing Zhang, Yingdi Shan, Kang Chen, Jinlei Jiang, Yongwei Wu 0001. 449-464
- Low End-to-End Latency atop a Speculative Shared Log with Fix-Ante Ordering. Shreesha G. Bhat, Tony Hong, Xuhao Luo, Jiyu Hu, Aishwarya Ganesan, Ramnatthan Alagappan. 465-481
- Understanding Stragglers in Large Model Training Using What-if Analysis. Jinkun Lin, Ziheng Jiang, Zuquan Song, Sida Zhao, Menghan Yu, Zhanghan Wang, Chenyuan Wang, Zuocheng Shi, Xiang Shi, Wei Jia, Zherui Liu, Shuguang Wang, Haibin Lin, Xin Liu 0086, Aurojit Panda, Jinyang Li. 483-498
- Kamino: Efficient VM Allocation at Scale with Latency-Driven Cache-Aware Scheduling. David Domingo, Hugo Barbalho, Marco Molinaro 0004, Kuan Liu, Abhisek Pan, David Dion, Thomas Moscibroda, Sudarsun Kannan, Ishai Menache. 519-535
- ZEN: Empowering Distributed Training with Sparsity-driven Data Synchronization. Zhuang Wang, Zhaozhuo Xu, Jingyi Xi, Yuke Wang, Anshumali Shrivastava, T. S. Eugene Ng. 537-556
- Extending Applications Safely and Efficiently. Yusheng Zheng, Tong Yu, Yiwei Yang 0002, Yanpeng Hu, Xiaozheng Lai, Dan Williams 0001, Andi Quinn. 557-574
- Tintin: A Unified Hardware Performance Profiling Infrastructure to Uncover and Manage Uncertainty. Ao Li 0006, Marion Sudvarg, Zihan Li, Sanjoy K. Baruah, Chris Gill 0001, Ning Zhang 0017. 575-593
- Building Bridges: Safe Interactions with Foreign Languages through Omniglot. Leon Schuermann, Jack Toubes, Tyler Potyondy, Pat Pannuto, Mae Milano, Amit Levy. 595-613
- KRR: Efficient and Scalable Kernel Record Replay. Tianren Zhang, Sishuai Gong, Pedro Fonseca 0001. 615-632
- Deterministic Client: Enforcing Determinism on Untrusted Machine Code. Zachary Yedidia, Geoffrey Ramseyer, David Mazières. 633-649
- Disentangling the Dual Role of NIC Receive Rings. Boris Pismenny, Adam Morrison 0001, Dan Tsafrir. 651-669
- XSched: Preemptive Scheduling for Diverse XPUs. Weihang Shen, Mingcong Han, Jialong Liu, Rong Chen 0001, Haibo Chen 0001. 671-692
- OS Rendering Service Made Parallel with Out-of-Order Execution and In-Order Commit. Yuanpei Wu, Chao Xu, Yubin Xia, Yang Yu, Ming Fu, Binyu Zang, Haibo Chen 0001. 693-710
- EMT: An OS Framework for New Memory Translation Architectures. Siyuan Chai, Jiyuan Zhang 0003, Jongyul Kim 0001, Alan Wang, Fan Chung, Jovan Stojkovic, Weiwei Jia 0001, Dimitrios Skarlatos 0002, Josep Torrellas, Tianyin Xu. 711-729
- Tiered Memory Management Beyond Hotness. Jinshu Liu, Hamid Hadian, Hanchen Xu, Huaicheng Li. 731-747
- NanoFlow: Towards Optimal Large Language Model Serving Throughput. Kan Zhu, Yufei Gao, Yilong Zhao, Liangyu Zhao, Gefei Zuo, Yile Gu, Dedong Xie, Zihao Ye 0001, Keisuke Kamahori, Chien-Yu Lin, Ziren Wang, Stephanie Wang, Arvind Krishnamurthy, Baris Kasikci. 749-765
- PipeThreader: Software-Defined Pipelining for Efficient DNN Execution. Yu Cheng, Lei Wang, Yining Shi 0001, Yuqing Xia, Lingxiao Ma, Jilong Xue, Yang Wang, Zhiwen Mo, Feiyang Chen, Fan Yang 0024, Mao Yang 0004, Zhi Yang 0001. 767-783
- WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training. Zheng Wang 0075, Anna Cai, Xinfeng Xie, Zaifeng Pan, Yue Guan, Weiwei Chu, Jie Wang 0022, Shikai Li, Jianyu Huang, Chris Cai, Yuchen Hao, Yufei Ding 0001. 785-801
- DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization. Yeonhong Park, Jake Hyun, Hojoon Kim, Jae W. Lee. 803-819
- Stripeless Data Placement for Erasure-Coded In-Memory Storage. Jian Gao, Jiwu Shu, Bin Yan, Yuhao Zhang 0006, Keji Huang. 821-838
- PoWER Never Corrupts: Tool-Agnostic Verification of Crash Consistency and Corruption Detection. Hayley LeBlanc, Jacob R. Lorch, Chris Hawblitzel, Cheng Huang, Yiheng Tao, Nickolai Zeldovich, Vijay Chidambaram. 839-857
- Fast and Synchronous Crash Consistency with Metadata Write-Once File System. Yanqi Pan, Wen Xia, Yifeng Zhang, Xiangyu Zou, Hao Huang, Zhenhua Li 0001, Chentao Wu. 859-878
- Decentralized, Epoch-based F2FS Journaling with Fine-grained Crash Recovery. Yaotian Cui, Zhiqi Wang, Renhai Chen, Zili Shao. 879-895
- Okapi: Decoupling Data Striping and Redundancy Grouping in Cluster File Systems. Sanjith Athlur, Timothy Kim, Saurabh Kadekodi, Francisco Maturana, Xavier Ramos, Arif Merchant, K. V. Rashmi, Gregory R. Ganger. 897-914
- Compass: Encrypted Semantic Search with High Accuracy. Jinhao Zhu, Liana Patel, Matei Zaharia, Raluca Ada Popa. 915-938
- Weave: Efficient and Expressive Oblivious Analytics at Scale. Mahdi Soleimani, Grace Jia, Anurag Khandelwal. 939-955
- Paralegal: Practical Static Analysis for Privacy Bugs. Justus Adam, Carolyn Zech, Livia Zhu, Sreshtaa Rajesh, Nathan Harbison, Mithi Jethwa, Will Crichton, Shriram Krishnamurthi, Malte Schwarzkopf. 957-978
- MettEagle: Costs and Benefits of Implementing Containers on Microkernels. Till Miemietz, Viktor Reusch, Matthias Hille, Lars Wrenger, Jana Eisoldt, Jan Klötzke, Max Kurze, Adam Lackorzynski, Michael Roitzsch, Hermann Härtig. 979-996