- PRED: Performance-oriented Random Early Detection for Consistently Stable Performance in Datacenters. Xinle Du, Tong Li 0014, Guangmeng Zhou, Zhuotao Liu, Hanlin Huang, Xiangyu Gao, Mowei Wang, Kun Tan, Ke Xu 0002. 1-20
- Rajomon: Decentralized and Coordinated Overload Control for Latency-Sensitive Microservices. Jiali Xing, Akis Giannoukos, Paul Loh, Shuyue Wang, Justin Qiu, Henri Maxime Demoulin, Konstantinos Kallas, Benjamin C. Lee. 21-36
- Learnings from Deploying Network QoS Alignment to Application Priorities for Storage Services. Matthew Buckley, Parsa Pazhooheshy, Z. Morley Mao, Nandita Dukkipati, Hamid Hajabdolali Bazzaz, Priyaranjan Jha, Yingjie Bi, Steve Middlekauff, Yashar Ganjali. 37-53
- DISC: Backpressure Mitigation In Multi-tier Applications With Distributed Shared Connection. Brice Ekane, Djob Mvondo, Renaud Lachaize, Yérom-David Bromberg, Alain Tchana, Daniel Hagimont. 55-70
- Enabling Silent Telemetry Data Transmission with InvisiFlow. Yinda Zhang 0002, Liangcheng Yu, Gianni Antichi, Ran Ben-Basat, Vincent Liu 0001. 71-86
- Unlocking ECMP Programmability for Precise Traffic Control. Yadong Liu, Yunming Xiao, Xuan Zhang, Weizhen Dang, Huihui Liu, Xiang Li, Zekun He, Jilong Wang 0001, Aleksandar Kuzmanovic, Ang Chen 0001, Congcong Miao. 87-106
- Enabling Portable and High-Performance SmartNIC Programs with Alkali. Jiaxin Lin, Zhiyuan Guo, Mihir Shah, Tao Ji, Yiying Zhang 0005, Daehyeok Kim, Aditya Akella. 107-126
- Scaling IP Lookup to Large Databases using the CRAM Lens. Robert Chang, Pradeep Dogga, Andy Fingerhut, Victor Rios, George Varghese. 127-146
- Quicksand: Harnessing Stranded Datacenter Resources with Granular Computing. Zhenyuan Ruan, Shihang Li, Kaiyan Fan, Seo Jin Park, Marcos K. Aguilera, Adam Belay, Malte Schwarzkopf. 147-165
- Beehive: A Scalable Disaggregated Memory Runtime Exploiting Asynchrony of Multithreaded Programs. Quanxi Li, Hong Huang, Ying Liu, Yanwen Xia, Jie Zhang, Mosong Zhou, Xiaobing Feng 0002, Huimin Cui, Quan Chen, Yizhou Shan, Chenxi Wang. 167-187
- Making Serverless Pay-For-Use a Reality with Leopard. Tingjia Cao, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Tyler Caraza-Harter. 189-204
- GRANNY: Granular Management of Compute-Intensive Applications in the Cloud. Carlos Segarra, Simon Shillaker, Guo Li, Eleftheria Mappoura, Rodrigo Bruno, Lluís Vilanova, Peter R. Pietzuch. 205-218
- On Temporal Verification of Stateful P4 Programs. Delong Zhang, Chong Ye, Fei He 0001. 219-235
- NDD: A Decision Diagram for Network Verification. Zechun Li, Peng Zhang, Yichi Zhang, Hongkun Yang. 237-258
- Smart Casual Verification of the Confidential Consortium Framework. Heidi Howard, Markus A. Kuppe, Edward Ashton, Amaury Chamayou, Natacha Crooks. 259-276
- VEP: A Two-stage Verification Toolchain for Full eBPF Programmability. Xiwei Wu, Yueyang Feng, Tianyi Huang, Xiaoyang Lu, Shengkai Lin, Lihan Xie, Shizhen Zhao, Qinxiang Cao. 277-299
- MeshTest: End-to-End Testing for Service Mesh Traffic Management. Naiqian Zheng, Tianshuo Qiao, Xuanzhe Liu, Xin Jin 0008. 301-316
- Preventing Network Bottlenecks: Accelerating Datacenter Services with Hotspot-Aware Placement for Compute and Storage. Hamid Hajabdolali Bazzaz, Yingjie Bi, Weiwu Pang, Minlan Yu, Ramesh Govindan, Neal Cardwell, Nandita Dukkipati, Meng-Jung Tsai, Chris DeForeest, Yuxue Jin, Charles J. Carver, Jan Kopanski, Liqun Cheng, Amin Vahdat. 317-333
- Enhancing Network Failure Mitigation with Performance-Aware Ranking. Pooria Namyar, Arvin Ghavidel, Daniel Crankshaw, Daniel S. Berger, Kevin Hsieh, Srikanth Kandula, Ramesh Govindan, Behnaz Arzani. 335-357
- One-Size-Fits-None: Understanding and Enhancing Slow-Fault Tolerance in Modern Distributed Systems. Ruiming Lu, Yunchi Lu, Yuxuan Jiang, Guangtao Xue, Peng Huang. 359-378
- Pyrrha: Congestion-Root-Based Flow Control to Eliminate Head-of-Line Blocking in Datacenter. Kexin Liu, Zhaochen Zhang, Chang Liu 0001, Yizhi Wang, Vamsi Addanki, Stefan Schmid 0001, Qingyue Wang, Wei Chen, Xiaoliang Wang 0001, Jiaqi Zheng 0001, Wenhao Sun, Tao Wu, Ke Meng, Fei Chen, Weiguang Wang, Bingyang Liu, Wanchun Dou, Guihai Chen, Chen Tian 0001. 379-405
- eTran: Extensible Kernel Transport with eBPF. Zhongjie Chen, Qingkai Meng, ChonLam Lao, Yifan Liu, Fengyuan Ren, Minlan Yu, Yang Zhou. 407-425
- White-Boxing RDMA with Packet-Granular Software Control. Chenxingyu Zhao, Jaehong Min, Ming Liu 0027, Arvind Krishnamurthy. 427-449
- SIRD: A Sender-Informed, Receiver-Driven Datacenter Transport Protocol. Konstantinos Prasopoulos, Ryan Kosta, Edouard Bugnion, Marios Kogias. 451-471
- Accelerating Design Space Exploration for LLM Training Systems with Multi-experiment Parallel Simulation. Fei Gui, Kaihui Gao, Li Chen 0008, Dan Li 0001, Vincent Liu 0001, Ran Zhang, Hongbing Yang, Dian Xiong. 473-488
- Optimizing RLHF Training for Large Language Models with Stage Fusion. Yinmin Zhong, Zili Zhang, Bingyang Wu, Shengyu Liu, Yukun Chen, Changyi Wan, Hanpeng Hu, Lei Xia, Ranchen Ming, Yibo Zhu, Xin Jin 0008. 489-503
- Minder: Faulty Machine Detection for Large-scale Distributed Model Training. Yangtao Deng, Xiang Shi, Zhuo Jiang, Xingjian Zhang, Lei Zhang, Zhang Zhang 0003, Bo Li, Zuquan Song, Hang Zhu, Gaohong Liu, Fuliang Li, Shuguang Wang, Haibin Lin, Jianxi Ye, Minlan Yu. 505-521
- Holmes: Localizing Irregularities in LLM Training with Mega-scale GPU Clusters. Zhiyi Yao, Pengbo Hu, Congcong Miao, Xuya Jia, Zuning Liang, Yuedong Xu, Chunzhi He, Hao Lu, Mingzhuo Chen, Xiang Li, Zekun He, Yachen Wang, Xianneng Zou, Junchen Jiang. 523-540
- SimAI: Unifying Architecture Design and Performance Tuning for Large-Scale Large Language Model Training with Scalability and Precision. Xizheng Wang, Qingxu Li, Yichi Xu, Gang Lu, Dan Li 0001, Li Chen 0008, Heyang Zhou, Linkang Zheng, Sen Zhang, Yikai Zhu, Yang Liu, Pengcheng Zhang, Kun Qian, Kunling He, Jiaqi Gao, Ennan Zhai, Dennis Cai, Binzhang Fu. 541-558
- ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development. Borui Wan, Mingji Han, Yiyao Sheng, Yanghua Peng, Haibin Lin, Mofan Zhang, Zhichao Lai, Menghan Yu, Junda Zhang, Zuquan Song, Xin Liu, Chuan Wu 0001. 559-578
- Mowgli: Passively Learned Rate Control for Real-Time Video. Neil Agarwal, Rui Pan 0003, Francis Y. Yan, Ravi Netravali. 579-594
- Dissecting and Streamlining the Interactive Loop of Mobile Cloud Gaming. Yang Li 0092, Jiaxing Qiu, Hongyi Wang 0009, Zhenhua Li 0001, Feng Qian 0001, Jing Yang, Hao Lin 0005, Yunhao Liu 0001, Bo Xiao, Xiaokang Qin, Tianyin Xu. 595-611
- Region-based Content Enhancement for Efficient Video Analytics at the Edge. Weijun Wang, Liang Mi, Shaowei Cen, Haipeng Dai 0001, Yuanchun Li, Xiaoming Fu 0001, Yunxin Liu. 613-633
- Tooth: Toward Optimal Balance of Video QoE and Redundancy Cost by Fine-Grained FEC in Cloud Gaming Streaming. Congkai An, Huanhuan Zhang, Shibo Wang, Jingyang Kang, Anfu Zhou, Liang Liu 0001, Huadong Ma, Zili Meng, Delei Ma, Yusheng Dong, Xiaogang Lei. 635-651
- AsTree: An Audio Subscription Architecture Enabling Massive-Scale Multi-Party Conferencing. Tong Meng, Wenfeng Li, Chao Yuan, Changqing Yan, Le Zhang. 653-666
- AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training. Guanbin Xu, Zhihao Le, Yinhe Chen, Zhiqi Lin, Zewen Jin, Youshan Miao, Cheng Li. 667-683
- OptiReduce: Resilient and Tail-Optimal AllReduce for Distributed Deep Learning in the Cloud. Ertza Warraich, Omer Shabtai, Khalid Manaa, Shay Vargaftik, Yonatan Piasetzky, Matty Kadosh, Lalith Suresh, Muhammad Shahbaz 0001. 685-703
- Efficient Direct-Connect Topologies for Collective Communications. Liangyu Zhao, Siddharth Pal, Tapan Chugh, Weiyang Wang, Jason Fantl, Prithwish Basu, Joud Khoury, Arvind Krishnamurthy. 705-737
- SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads. Alind Khare, Dhruv Garg, Sukrit Kalra, Snigdha Grandhi, Ion Stoica, Alexey Tumanov. 739-758
- Pineapple: Unifying Multi-Paxos and Atomic Shared Registers. Tigran Bantikyan, Jonathan Zarnstorff, Te-Yen Chou, Lewis Tseng, Roberto Palmieri. 759-778
- Ladder: A Convergence-based Structured DAG Blockchain for High Throughput and Low Latency. Dengcheng Hu, Jianrong Wang, Xiulong Liu, Hao Xu, Xujing Wu, Muhammad Shahzad 0001, Guyue Liu, Keqiu Li. 779-794
- Vegeta: Enabling Parallel Smart Contract Execution in Leaderless Blockchains. Tianjing Xu, Yongqi Zhong, Yiming Zhang 0003, Ruofan Xiong, Jingjing Zhang 0002, Guangtao Xue, Shengyun Liu. 795-811
- Shoal++: High Throughput DAG BFT Can Be Fast and Robust! Balaji Arun, Zekun Li 0009, Florian Suri-Payer, Sourav Das 0001, Alexander Spiegelman. 813-826
- Learning Production-Optimized Congestion Control Selection for Alibaba Cloud CDN. Xuan Zeng, Haoran Xu, Chen Chen, Xumiao Zhang, Xiaoxi Zhang 0001, Xu Chen 0004, Guihai Chen, Yubing Qiu, Yiping Zhang, Chong Hao, Ennan Zhai. 827-845
- GPU-Disaggregated Serving for Deep Learning Recommendation Models at Scale. Lingyun Yang, Yongchen Wang, Yinghao Yu, Qizhen Weng, Jianbo Dong, Kan Liu, Chi Zhang, Yanyi Zi, Hao Li, Zechao Zhang, Nan Wang, Yu Dong, Menglei Zheng, Lanlan Xi, Xiaowei Lu, Liang Ye, Guodong Yang, Binzhang Fu, Tao Lan, Liping Zhang, Lin Qu, Wei Wang 0030. 847-863
- Evolution of Aegis: Fault Diagnosis for AI Model Training Service in Production. Jianbo Dong, Kun Qian, Pengcheng Zhang, Zhilong Zheng, Liang Chen, Fei Feng, Yichi Xu, Yikai Zhu, Gang Lu, Xue Li, Zhihui Ren, Zhicheng Wang, Bin Luo, Peng Zhang, Yang Liu, Yanqing Chen, Yu Guan, Weicheng Wang, Chaojie Yang, Yang Zhang, Man Yuan, Hanyu Zhao, Yong Li, Zihan Zhao, Shan Li, Xianlong Zeng, Zhiping Yao, Binzhang Fu, Ennan Zhai, Wei Lin, Chao Wang, Dennis Cai. 865-881
- PAPAYA Federated Analytics Stack: Engineering Privacy, Scalability and Practicality. Harish Srinivas, Graham Cormode, Mehrdad Honarkhah, Samuel Lurye, Jonathan Hehir, Lunwen He, George Hong, Ahmed Magdy, Dzmitry Huba, Kaikai Wang, Shen Guo, Shoubhik Bhattacharya. 883-898
- HA/TCP: A Reliable and Scalable Framework for TCP Network Functions. Haoyu Gu, Ali José Mashtizadeh, Bernard Wong 0001. 899-914
- High-level Programming for Application Networks. Xiangfeng Zhu, Yuyao Wang, Banruo Liu, Yongtong Wu, Nikola Bojanic, Jingrong Chen 0002, Gilbert Louis Bernstein, Arvind Krishnamurthy, Sam Kumar, Ratul Mahajan, Danyang Zhuo. 915-935
- State-Compute Replication: Parallelizing High-Speed Stateful Packet Processing. Qiongwen Xu, Sebastiano Miano, Xiangyu Gao, Tao Wang 0088, Adithya Murugadass, Songyuan Zhang, Anirudh Sivaraman, Gianni Antichi, Srinivas Narayana. 937-957
- MTP: Transport for In-Network Computing. Tao Ji, Rohan Vardekar, Balajee Vamanan, Brent E. Stephens, Aditya Akella. 959-977
- ONCache: A Cache-Based Low-Overhead Container Overlay Network. Shengkai Lin, Shizhen Zhao, Peirui Cao, Xinchi Han, Quan Tian, Wenfeng Liu, Qi Wu, Donghai Han, Xinbing Wang. 979-998
- GREEN: Carbon-efficient Resource Scheduling for Machine Learning Clusters. Kaiqiang Xu, Decang Sun, Han Tian, Junxue Zhang 0001, Kai Chen 0005. 999-1014
- The Benefits and Limitations of User Interrupts for Preemptive Userspace Scheduling. Linsong Guo, Danial Zuberi, Tal Garfinkel, Amy Ousterhout. 1015-1032
- Securing Public Cloud Networks with Efficient Role-based Micro-Segmentation. Sathiya Kumaran Mani, Kevin Hsieh, Santiago Segarra, Ranveer Chandra, Yajie Zhou, Srikanth Kandula. 1033-1048
- Mitigating Scalability Walls of RDMA-based Container Networks. Wei Liu 0148, Kun Qian 0004, Zhenhua Li 0001, Feng Qian 0001, Tianyin Xu, Yunhao Liu 0001, Yu Guan, Shuhong Zhu, Hongfei Xu, Lanlan Xi, Chao Qin, Ennan Zhai. 1049-1065
- Eden: Developer-Friendly Application-Integrated Far Memory. Anil Yelam, Stewart Grant, Saarth Deshpande, Nadav Amit, Radhika Niranjan Mysore, Amy Ousterhout, Marcos K. Aguilera, Alex C. Snoeren. 1067-1083
- Achieving Wire-Latency Storage Systems by Exploiting Hardware ACKs. Qing Wang 0031, Jiwu Shu, Jing Wang, Yuhao Zhang 0006. 1085-1100
- ODRP: On-Demand Remote Paging with Programmable RDMA. Zixuan Wang, Xingda Wei, Jinyu Gu 0001, Hongrui Xie, Rong Chen 0001, Haibo Chen 0001. 1101-1115
- Understanding and Profiling NVMe-over-TCP Using ntprof. Yuyuan Kang, Ming Liu 0027. 1117-1136
- Building an Elastic Block Storage over EBOFs Using Shadow Views. Sheng Jiang, Ming Liu 0027. 1137-1153
- Pushing the Limits of In-Network Caching for Key-Value Stores. Gyuyeong Kim. 1155-1168
- CellReplay: Towards accurate record-and-replay for cellular networks. William Sentosa, Balakrishnan Chandrasekaran 0002, Philip Brighten Godfrey, Haitham Hassanieh. 1169-1186
- Large Network UWB Localization: Algorithms and Implementation. Nakul Garg, Irtaza Shahid, Ramanujan K. Sheshadri, Karthikeyan Sundaresan, Nirupam Roy. 1187-1203
- Towards Energy Efficient 5G vRAN Servers. Anuj Kalia, Nikita Lazarev, Leyang Xue, Xenofon Foukas, Bozidar Radunovic, Francis Y. Yan. 1205-1219
- Building Massive MIMO Baseband Processing on a Single-Node Supercomputer. Xincheng Xie, Wentao Hou, Zerui Guo, Ming Liu 0027. 1221-1242
- Efficient Multi-WAN Transport for 5G with OTTER. Mary Hogan, Gerry Wan, Yiming Qiu, Sharad Agarwal, Ryan Beckett, Rachee Singh, Paramvir Bahl. 1243-1267
- Verifying maximum link loads in a changing world. Tibor Schneider, Stefano Vissicchio, Laurent Vanbever. 1269-1287
- A Layered Formal Methods Approach to Answering Queue-related Queries. Divya Raghunathan, Maria Apostolaki, Aarti Gupta. 1289-1304
- Runtime Protocol Refinement Checking for Distributed Protocol Implementations. Ding Ding, Zhanghan Wang, Jinyang Li 0001, Aurojit Panda. 1305-1326
- CEGS: Configuration Example Generalizing Synthesizer. Jianmin Liu, Li Chen 0008, Dan Li 0001, Yukai Miao. 1327-1347
- Suppressing BGP Zombies with Route Status Transparency. Yosef Edery Anahory, Jie Kong, Nicholas Scaglione, Justin Furuness, Hemi Leibowitz, Amir Herzberg, Bing Wang 0001, Yossi Gilad. 1349-1366
- ValidaTor: Domain Validation over Tor. Jens Frieß, Haya Schulmann, Michael Waidner. 1367-1380
- From Address Blocks to Authorized Prefixes: Redesigning RPKI ROV with a Hierarchical Hashing Scheme for Fast and Memory-Efficient Validation. Zedong Ni, Yinbo Xu, Hui Zou, Yanbiao Li, Guang Cheng, Gaogang Xie. 1381-1397
- PreAcher: Secure and Practical Password Pre-Authentication by Content Delivery Networks. Shihan Lin, Suting Chen, Yunming Xiao, Yanqi Gu, Aleksandar Kuzmanovic, Xiaowei Yang 0001. 1399-1419
- ClubHeap: A High-Speed and Scalable Priority Queue for Programmable Packet Scheduling. Zhikang Chen, Haoyu Song 0001, Zhiyu Zhang, Yang Xu 0010, Bin Liu 0001. 1421-1436
- Self-Clocked Round-Robin Packet Scheduling. Erfan Sharafzadeh, Raymond Matson, Jean Tourrilhes, Puneet Sharma, Soudeh Ghorbani. 1437-1465
- Everything Matters in Programmable Packet Scheduling. Albert Gran Alcoz, Balázs Vass, Pooria Namyar, Behnaz Arzani, Gábor Rétvári, Laurent Vanbever. 1467-1485
- When P4 Meets Run-to-completion Architecture. Hao Zheng, Xin Yan, Wenbo Li, Jiaqi Zheng 0001, Xiaoliang Wang 0001, Qingqing Zhao, Luyou He, XiaoFei Lai, Feng Gao, Fuguang Huang, Wanchun Dou, Guihai Chen, Chen Tian 0001. 1487-1505
- Mutant: Learning Congestion Control from Existing Protocols via Online Reinforcement Learning. Lorenzo Pappone, Alessio Sacco, Flavio Esposito. 1507-1522
- CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines. Gerry Wan, Shinan Liu, Francesco Bronzino, Nick Feamster, Zakir Durumeric. 1523-1540
- Resolving Packets from Counters: Enabling Multi-scale Network Traffic Super Resolution via Composable Large Traffic Model. Xizheng Wang, Libin Liu 0001, Li Chen 0008, Dan Li 0001, Yukai Miao, Yu Bai 0021. 1541-1561
- BFTBrain: Adaptive BFT Consensus with Reinforcement Learning. Chenyuan Wu, Haoyun Qin, Mohammad Javad Amiri, Boon Thau Loo, Dahlia Malkhi, Ryan Marcus. 1563-1583