- Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale. Sushant Kumar Gupta, Anil Raghunath Iyer, Chang Yu, Neel Bagora, Olivier Pomerleau, Vivek Kumar, Prunthaban Kanthakumar. 1-17
- Poby: SmartNIC-accelerated Image Provisioning for Coldstart in Clouds. Zihao Chang, Jiaqi Zhu, Haifeng Sun 0004, Yunlong Xie, Kan Shi, Ninghui Sun, Yungang Bao, Sa Wang. 19-37
- Burst Computing: Quick, Sudden, Massively Parallel Processing on Serverless Resources. Daniel Barcelona Pons, Aitor Arjona, Pedro García López, Enrique Molina-Giménez, Stepan Klymonchuk. 39-56
- DEEPSERVE: Serverless Large Language Model Serving at Scale. Junhao Hu, Jiang Xu, Zhixia Liu, Yulong He, Yuetao Chen, Hao Xu, Jiang Liu, Jie Meng, Baoquan Zhang, Shining Wan, Gengyuan Dan, Zhiyu Dong, Zhihao Ren, Changhong Liu, Tao Xie, Dayun Lin, Qin Zhang, Yue Yu, Hao Feng, Xusheng Chen, Yizhou Shan. 57-72
- Cosmic: Cost-Effective Support for Cloud-Assisted 3D Printing. Yuan Yao, Chuan He, Chinedum Emmanuel Okwudire, Harsha V. Madhyastha. 73-88
- GMI-DRL: Empowering Multi-GPU DRL with Adaptive-Grained Parallelism. Yuke Wang, Boyuan Feng, Zheng Wang 0075, Guyue Huang, Tony Tong Geng, Ang Li 0006, Yufei Ding 0001. 89-103
- mTuner: Accelerating Parameter-Efficient Fine-Tuning on Multi-GPU Servers with Elastic Tensor. Kezhao Huang, Siqi Zhu, Mingshu Zhai, Liyan Zheng 0001, Kinman Lei, Jiaao He, Yuyang Jin 0001, Jidong Zhai. 105-121
- JENGA: Enhancing LLM Long-Context Fine-tuning with Contextual Token Sparsity. Tuowei Wang, Xingyu Chen, Kun Li, Ting Cao, Ju Ren 0001, Yaoxue Zhang. 123-141
- FlexPipe: Maximizing Training Efficiency for Transformer-based Models with Variable-Length Inputs. Hairui Zhao, Qi Tian, Hongliang Li 0003, Zizhong Chen. 143-159
- Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation. Weiqi Feng, Yangrui Chen, Shaoyu Wang, Yanghua Peng, Haibin Lin, Minlan Yu. 161-177
- Towards Optimal Rack-scale μs-level CPU Scheduling through In-Network Workload Shaping. Xudong Liao, Han Tian, Xinchen Wan, Chaoliang Zeng, Hao Wang 0116, Junxue Zhang 0001, Mengyu Ma, Guyue Liu, Kai Chen 0005. 179-198
- TGW: Operating an Efficient and Resilient Cloud Gateway at Scale. Yifan Yang 0009, Lin He 0004, Jiasheng Zhou, Xiaoyi Shi, Yichi Xu, Shicheng Wang, Jinlong E, Ying Liu 0024, Junwei Zhang, Zhuang Yuan, Hengyang Xu. 199-215
- MARC: Motion-Aware Rate Control for Mobile E-commerce Cloud Rendering. Yuankang Zhao, Furong Yang, Gerui Lv, Qinghua Wu 0004, Yanmei Liu, Jiuhai Zhang, Yutang Peng, Feng Peng, Hongyu Guo, Ying Chen 0011, Zhenyu Li 0001, Gaogang Xie. 217-232
- Accelerating Distributed Graph Learning by Using Collaborative In-Network Multicast and Aggregation. Zhaoyi Li, Jiawei Huang 0001, Yijun Li 0002, Jingling Liu, Junxue Zhang 0001, Hui Li, Xiaojun Zhu, Shengwen Zhou, Jing Shao, Xiaojuan Lu, Qichen Su, Jianxin Wang 0001, Chee-Wei Tan 0001, Yong Cui 0001, Kai Chen 0005. 233-247
- Opening Up Kernel-Bypass TCP Stacks. Shinichi Awamoto, Michio Honda. 249-262
- GPREEMPT: GPU Preemptive Scheduling Made General and Efficient. Ruwen Fan, Tingxu Ren, Minhui Xie, Shiwei Gao, Jiwu Shu, Youyou Lu. 263-272
- μEFI: A Microkernel-Style UEFI with Isolation and Transparency. Le Chen, Yiyang Wu, Jinyu Gu 0001, Yubin Xia, Haibo Chen 0001. 273-289
- PageFlex: Flexible and Efficient User-space Delegation of Linux Paging Policies with eBPF. Anil Yelam, Kan Wu, Zhiyuan Guo, Suli Yang, Rajath Shashidhara, Wei Xu, Stanko Novakovic, Alex C. Snoeren, Kimberly Keeton. 291-306
- ASTERINAS: A Linux ABI-Compatible, Rust-Based Framekernel OS with a Small and Sound TCB. Yuke Peng, Hongliang Tian, Junyang Zhang, Ruihan Li, Chengjun Chen, Jianfeng Jiang, Jinyi Xian, Xiaolin Wang, Chenren Xu, Diyu Zhou, Yingwei Luo, Shoumeng Yan, Yinqian Zhang. 307-323
- Rex: Closing the language-verifier gap with safe and usable kernel extensions. Jinghao Jia, Ruowen Qin, Milo Craun, Egor Lukiyanov, Ayush Bansal, Minh Phan, Michael V. Le, Hubertus Franke, Hani Jamjoom, Tianyin Xu, Dan Williams 0001. 325-342
- Barre: Empowering Simplified and Versatile Programmable Congestion Control in High-Speed AI Clusters. Yajuan Peng, Haoran Wei, Xiaolong Zhong, Junkai Huang, Haohan Xu, ZiCheng Wang, Yang Bai, Zhuo Jiang, Jianxi Ye, Xiaoliang Wang 0001, Xiaoming Fu 0001, Huichen Dai. 343-363
- FLB: Fine-grained Load Balancing for Lossless Datacenter Networks. Jinbin Hu, Wenxue Li, Xiangzhou Liu, Junfeng Wang, Bowen Liu, Ping Yin, Jianxin Wang 0001, Jiawei Huang 0001, Kai Chen 0005. 365-380
- SNARY: A High-Performance and Generic SmartNIC-accelerated Retrieval System. Qiaoyin Gan, Heng Pan, Luyang Li, Kai Lv, Hongtao Guan, Zhaohua Wang, Zhenyu Li 0001, Gaogang Xie. 381-398
- Minos: A Lightweight and Dynamic Defense against Traffic Analysis in Programmable Data Planes. Zihao Wang, Qing Li 0006, Guorui Xie, Dan Zhao 0003, Kejun Li, Zhuochen Fan, Lianbo Ma, Yong Jiang 0001. 399-415
- GeneralSparse: Bridging the Gap in SpMM for Pruned Large Language Model Inference on GPUs. Yaoyu Wang, Xiao Guo, Junmin Xiao, De Chen, Guangming Tan. 417-432
- HyCache: Hybrid Caching for Accelerating DNN Input Preprocessing Pipelines. Keshav Vinayak Jha, Shweta Pandey, Murali Annavaram, Arkaprava Basu. 433-448
- The Koala Benchmarks for the Shell: Characterization and Implications. Evangelos Lamprou, Ethan Williams, Georgios Kaoukis, Zhuoxuan Zhang, Michael Greenberg 0002, Konstantinos Kallas, Lukas Lazarek, Nikos Vasilakis. 449-464
- KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider. Jiahao Wang, Jinbo Han, Xingda Wei, Sijie Shen, Dingyan Zhang, Chenguang Fang, Rong Chen 0001, Wenyuan Yu, Haibo Chen 0001. 465-482
- LogCrisp: Fast Aggregated Analysis on Large-scale Compressed Logs by Enabling Two-Phase Pattern Extraction and Vectorized Queries. Junyu Wei, Guangyan Zhang, Junchao Chen 0005, Qi Zhou 0001. 483-496
- HotRAP: Hot Record Retention and Promotion for LSM-trees with Tiered Storage. Jiansheng Qiu, Fangzhou Yuan, Mingyu Gao, Huanchen Zhang. 497-511
- Mitigating Resource Usage Dependency in Sorting-based KV Stores on Hybrid Storage Devices via Operation Decoupling. Qingyang Zhang, Yongkun Li 0001, Yubiao Pan, Haoting Tang, Yinlong Xu. 513-529
- SolFS: An Operation-Log Versioning File System for Hash-free Efficient Mobile Cloud Backup. Riwei Pan, Yu Liang 0004, Lei Li, Hongchao Du, Tei-Wei Kuo, Chun Jason Xue. 531-545
- Z-LFS: A Zoned Namespace-tailored Log-structured File System for Commodity Small-zone ZNS SSDs. Inhwi Hwang, Sangjin Lee 0003, Sunggon Kim, Hyeonsang Eom, Yongseok Son. 547-562
- CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge. Chunlin Tian, Xinpeng Qin, Kahou Tam, Li Li 0064, Zijian Wang, Yuanzhe Zhao, Minglei Zhang, Chengzhong Xu 0001. 563-585
- Weaver: Efficient Multi-LLM Serving with Attention Offloading. Shiwei Gao, Qing Wang 0031, Shaoxun Zeng, Youyou Lu, Jiwu Shu. 587-595
- Torpor: GPU-Enabled Serverless Computing for Low-Latency, Resource-Efficient Inference. Minchen Yu, Ao Wang, Dong Chen, Haoxuan Yu, Xiaonan Luo, Zhuohao Li, Wei Wang 0030, Ruichuan Chen, Dapeng Nie, Haoran Yang, Yu Ding. 597-612
- Toppings: CPU-Assisted, Rank-Aware Adapter Serving for LLM Inference. Suyi Li 0002, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, Wei Wang 0030. 613-629
- QFactory: Accelerating Quantized Large Language Model Serving with Qtile Graphs. Qihao Zhang, Mingshu Zhai, Rui Sun, Jidong Zhai. 631-646
- PluS: Highly Efficient and Expandable ML Compiler with Pluggable Graph Schedules. Ruofan Wu, Zhen Zheng, Feng Zhang 0007, Chuanjie Liu, Zaifeng Pan, Jidong Zhai, Xiaoyong Du 0001. 647-663
- Obscura: Concealing Recomputation Overhead in Training of Large Language Models with Bubble-filling Pipeline Transformation. Yuzhou Huang, Yapeng Jiang, Zicong Hong, Wuhui Chen, Bin Wang 0034, Weixi Zhu, Yue Yu 0001, Zibin Zheng. 665-678
- PPipe: Efficient Video Analytics Serving on Heterogeneous GPU Clusters via Pool-Based Pipeline Parallelism. Z. Jonny Kong, Qiang Xu 0006, Y. Charlie Hu. 679-698
- Voltrix: Sparse Matrix-Matrix Multiplication on Tensor Cores with Asynchronous and Balanced Kernel Optimization. Yaqi Xia, Weihu Wang, Donglin Yang, Xiaobo Zhou 0002, Dazhao Cheng. 699-714
- NetKeeper: Enhancing Network Resilience with Autonomous Network Configuration Update on Traffic Patterns and Anomalies. Zhaoyang Wan, Rongxin Han, Haifeng Sun 0001, Qi Qi 0001, Zirui Zhuang, Bo He 0003, Liang Zhang, Jianxin Liao, Jingyu Wang 0001. 715-730
- GREYHOUND: Hunting Fail-Slows in Hybrid-Parallel Training at Scale. Tianyuan Wu, Wei Wang 0030, Yinghao Yu, Siran Yang, Wenchao Wu, Qinkai Duan, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang. 731-747
- Crash Consistency in Block-Level Caching Systems: An Open CAS Case Study. Shaohua Duan, Youmin Chen. 749-764
- FiDe: Reliable and Fast Crash Failure Detection to Boost Datacenter Coordination. Davide Rovelli, Pavel Chuprikov, Philipp Berdesinski, Ali Pahlevan, Patrick Jahnke, Patrick Eugster. 765-788
- LEOCraft: Towards Designing Performant LEO Networks. Suvam Basak, Amitangshu Pal, Debopam Bhattacherjee. 789-813
- Emulating Space Computing Networks with RHONE. Liying Wang, Qing Li 0028, Yuhan Zhou, Zhaofeng Luo, Donghao Zhang, Shangguang Wang, Xuanzhe Liu, Chenren Xu. 815-831
- Roaming Free in the VR World with MP2. Yifei Xu, Xumiao Zhang, Yuning Chen, Pan Hu, Xuan Zeng, Zhilong Zheng, Xianshang Lin, Yanmei Liu, Songwu Lu, Z. Morley Mao, Wan Du, Dennis Cai, Ennan Zhai, Yunfei Ma. 833-850
- STORM: a Multipath QUIC Scheduler for Quick Streaming Media Transport under Unstable Mobile Networks. Liekun Hu, Changlong Li. 851-866
- Internet Connection Splitting: What's Old is New Again. Gina Yuan, Thea Rossman, Keith Winstein. 867-887
- WIC: Hiding Producer-Consumer Synchronization Delays with Warp-Level Interrupt-based GPU Communications. Jiajian Zhang, Fangyu Wu 0001, Hai Jiang 0003, Qiufeng Wang 0001, Genlang Chen, Chaoyi Pang. 889-904
- Primus: Unified Training System for Large-Scale Deep Learning Recommendation Models. Jixi Shan, Xiuqi Huang, Yang Guo, Hongyue Mao, Ho-Pang Hsu, Hang Cheng, Can Wang, Jun Song, Rui Shi, Xiaofeng Gao 0001, Jingwei Xu, Shiru Ren, Jiaxiao Zheng, Hua Huang, Lele Yu, Peng Xu, Guihai Chen. 905-922
- Chitu: Avoiding Unnecessary Fallback in Byzantine Consensus. Rongji Huang, Xiangzhe Wang, Xiaofeng Yan, Lei Fan 0002, Guangtao Xue, Shengyun Liu. 923-942
- Fast Distributed Transactions for RDMA-based Disaggregated Memory. Haodi Lu, Haikun Liu, Yujian Zhang, Zhuohui Duan, Xiaofei Liao, Hai Jin 0001, Yu Zhang 0027. 943-958
- Cuckoo for Clients: Disaggregated Cuckoo Hashing. Stewart Grant, Alex C. Snoeren. 959-972
- LITESHIELD: Secure Containers via Lightweight, Composable Userspace μKernel Services. Kaesi Manakkal, Nathan Daughety, Marcus Pendleton, Hui Lu 0001. 973-985
- Accelerating Nested Virtualization with HyperTurtle. Ori Ben Zur, Jakob Krebs, Shai Aviram Bergman, Mark Silberstein. 987-1002
- Efficient Performance-Aware GPU Sharing with Compatibility and Isolation through Kernel Space Interception. Shulai Zhang, Ao Xu, Quan Chen 0002, Han Zhao 0005, Weihao Cui, Zhen Wang, Yan Li, Limin Xiao, Minyi Guo. 1003-1019
- AnchorNet: Bridging Live and Collaborative Streaming with a Unified Architecture. Tong Meng, Wei Zhang, Dong Chen, Zhen Wang, Quanqing Li, Changqing Yan, Wei Yang, Chao Yuan, Le Zhang, Jianxin Kuang, Jianlin Xu. 1021-1036
- Katz: Efficient Workflow Serving for Diffusion Models with Many Adapters. Suyi Li 0002, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Dakai An, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang 0030. 1037-1052
- PopFetcher: Towards Accelerated Mixture-of-Experts Training Via Popularity Based Expert-Wise Prefetch. Junyi Zhang, Chuanhu Ma, Xiong Wang, Yuntao Nie, Yuqing Li, Yuedong Xu 0001, Xiaofei Liao, Bo Li 0001, Hai Jin 0001. 1053-1069
- HypeReca: Distributed Heterogeneous In-Memory Embedding Database for Training Recommender Models. Jiaao He, Shengqi Chen 0001, Kezhao Huang, Jidong Zhai. 1071-1087
- CrossPipe: Towards Optimal Pipeline Schedules for Cross-Datacenter Training. Tiancheng Chen, Ales Kubicek, Langwen Huang, Torsten Hoefler. 1089-1108
- Unveiling Compiler Faults via Attribute-Guided Compilation Space Exploration. Jiangchang Wu, Yibiao Yang, Maolin Sun, Yuming Zhou. 1109-1125
- Understanding and Detecting Fail-Slow Hardware Failure Bugs in Cloud Systems. Gen Dong, Yu Hua 0001, Yongle Zhang, Zhangyu Chen, Menglei Chen. 1127-1142
- Converos: Practical Model Checking for Verifying Rust OS Kernel Concurrency. Ruize Tang, Minghua Wang, Xudong Sun 0013, Lin Huang, Yu Huang 0002, Xiaoxing Ma. 1143-1159
- Bin2Wrong: a Unified Fuzzing Framework for Uncovering Semantic Errors in Binary-to-C Decompilers. Zao Yang, Stefan Nagy. 1161-1179
- HEC: Equivalence Verification Checking for Code Transformation via Equality Saturation. Jiaqi Yin, Zhan Song, Nicolas Bohm Agostini, Antonino Tumeo, Cunxi Yu. 1181-1196
- Para-ksm: Parallelized Memory Deduplication with Data Streaming Accelerator. Houxiang Ji, Minho Kim, Seonmu Oh, Daehoon Kim, Nam Sung Kim. 1197-1212
- DSA-2LM: A CPU-Free Tiered Memory Architecture with Intel DSA. Ruili Liu, Teng Ma, Mingxing Zhang, Jialiang Huang, Yingdi Shan, Zheng Liu, Lingfeng Xiang, Zhen Lin, Hui Lu 0001, Jia Rao, Kang Chen, Yongwei Wu 0001. 1213-1222
- Turbocharge ANNS on Real Processing-in-Memory by Enabling Fine-Grained Per-PIM-Core Scheduling. Puqing Wu, Minhui Xie, Enrui Zhao, Dafang Zhang, Jing Wang, Xiao Liang, Kai Ren, Yunpeng Chai. 1223-1241
- SwCC: Software-Programmable and Per-Packet Congestion Control in RDMA Engine. Hongjing Huang, Jie Zhang 0081, Xuzheng Chen, Ziyu Song, Jiajun Qin, Zeke Wang. 1243-1260
- DRack: A CXL-Disaggregated Rack Architecture to Boost Inter-Rack Communication. Xu Zhang, Ke Liu 0004, Yuan Hui, Xiaolong Zheng, Yisong Chang, Yizhou Shan, Guanghui Zhang, Ke Zhang 0017, Yungang Bao, Mingyu Chen 0001, Chenxi Wang. 1261-1279
- ShieldReduce: Fine-Grained Shielded Data Reduction. Jingyuan Yang, Jun Wu, Ruilin Wu, Jingwei Li 0001, Patrick P. C. Lee, Xiong Li 0002, Xiaosong Zhang 0001. 1281-1296
- MemoryTrap: Booby Trapping Memory to Counter Memory Disclosure Attacks with Hardware Support. Chenke Luo, Jiang Ming 0002, Dongpeng Xu 0001, Guojun Peng, Jianming Fu. 1297-1318
- Separate but Together: Integrating Remote Attestation into TLS. Carsten Weinhold, Muhammad Usama Sardar, Ionut Mihalcea, Yogesh Deshpande, Hannes Tschofenig, Yaron Sheffer, Thomas Fossati, Michael Roitzsch. 1319-1326
- DDLumos: Understanding and Detecting Atomic DDL Bugs in DBMSs. Zhiyong Wu 0010, Jie Liang 0006, Jingzhou Fu, Wenqian Deng, Yu Jiang 0001. 1327-1341
- SpaceExit: Enabling Efficient Adaptive Computing in Space with Early Exits. Jiacheng Liu 0001, Xiaozhi Zhu, Tongqiao Xu, Xiaofeng Hou, Chao Li 0009. 1343-1358
- XRT: An Accelerator-Aware Runtime for Accelerated Chip Multiprocessors. Neel Patel, Mohammad Alian. 1359-1369
- DShuffle: DPU-Optimized Shuffle Framework for Large-scale Data Processing. Chen Ding, Sicen Li, Kai Lu, Ting Yao, Daohui Wang, Huatao Wu, Jiguang Wan, Zhihu Tan, Changsheng Xie. 1371-1386
- Accelerating Model Training on Ascend Chips: An Industrial System for Profiling, Analysis and Optimization. Yuhang Zhou, Zibo Wang, Zhibin Wang, Ruyi Zhang 0005, Chen Tian 0001, Xiaoliang Wang 0001, Wanchun Dou, Guihai Chen, Bingqiang Wang, Yonghong Tian 0001, Yan Zhang, Hui Wang, Fuchun Wei, Boquan Sun, Jingyi Zhang, Bin She, Teng Su, Yifan Yao, Chunsheng Li, Ziyang Zhang, Yaoyuan Wang, Bin Zhou, Guyue Liu. 1387-1408
- CAFault: Enhance Fault Injection Technique in Practical Distributed Systems via Abundant Fault-Dependent Configurations. Yuanliang Chen, Fuchen Ma, Yuanhang Zhou, Zhen Yan, Yu Jiang 0001. 1409-1424
- Revealing Floating-Point Accumulation Orders in Software/Hardware Implementations. Peichen Xie, Yanjie Gao, Yang Wang, Jilong Xue. 1425-1440
- Inferring Likely Counting-related Atomicity Program Properties for Persistent Memory. Yunmo Zhang, Junqiao Qiu, Hong Xu 0001, Chun Jason Xue. 1441-1450
- Optimizing Input Minimization in Kernel Fuzzing. Hui Guo, Hao Sun, Shan Huang, Ting Su 0001, Geguang Pu, Shaohua Li 0002. 1451-1465
- IRHash: Efficient Multi-Language Compiler Caching by IR-Level Hashing. Tobias Landsberg, Johannes Grunenberg, Christian Dietrich 0001, Daniel Lohmann. 1467-1479
- On-Demand Container Partitioning for Distributed ML. Giovanni Bartolomeo, Navidreza Asadi, Wolfgang Kellerer, Jörg Ott, Nitinder Mohan. 1481-1500
- PathWeaver: A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search. Sukjin Kim, Seongyeon Park, Si Ung Noh, JungUk Hong, Taehee Kwon, Hunseong Lim, Jinho Lee. 1501-1517
- Universal Checkpointing: A Flexible and Efficient Distributed Checkpointing System for Large-Scale DNN Training with Reconfigurable Parallelism. Xinyu Lian, Sam Ade Jacobs, Lev Kurilenko, Masahiro Tanaka, Stas Bekman, Olatunji Ruwase, Minjia Zhang. 1519-1534
- Towards High-Performance Transactional Stateful Serverless Workflows with Affinity-Aware Leasing. Jianjun Zhao 0003, Haikun Liu, Shuhao Zhang 0001, Haodi Lu, Yancan Mao, Zhuohui Duan, Xiaofei Liao, Hai Jin 0001. 1535-1551
- Swift: Fast Performance Tuning with GAN-Generated Configurations. Chao Chen 0022, Shixin Huang, Xuehai Qian, Zhibin Yu 0001. 1553-1568
- PMR: Fast Application Response via Parallel Memory Reclaim on Mobile Devices. Wentong Li 0002, Li-Pin Chang, Yu Mao, Liang Shi 0001. 1569-1584
- SAVE: Software-Implemented Fault Tolerance for Model Inference against GPU Memory Bit Flips. Wenxin Zheng, Bin Xu, Jinyu Gu 0001, Haibo Chen 0001. 1585-1604
- Identifying and Analyzing Pitfalls in GNN Systems. Yidong Gong, Arnab Kanti Tarafder, Saima Afrin, Pradeep Kumar. 1605-1624
- Bluetooth Low Energy Security Testing with Combinatorial Methods. Dominik-Philip Schreiber, Manuel Leithner, Jovan Zivanovic, Dimitris E. Simos. 1625-1638
- Resource Multiplexing in Tuning and Serving Large Language Models. Yongjun He 0004, Haofeng Yang, Yao Lu, Ana Klimovic, Gustavo Alonso. 1639-1655
- Colocating ML Inference and Training with Fast GPU Memory Handover. Jiali Wang, Yankui Wang, Mingcong Han, Rong Chen 0001. 1657-1675
- AssyLLM: Efficient Federated Fine-tuning of LLMs via Assembling Pre-trained Blocks. Shichen Zhan, Li Li 0064, Chengzhong Xu 0001. 1677-1691
- Learning-Enhanced High-Throughput Pattern Matching Based on Programmable Data Plane. Guanglin Duan, Yucheng Huang, Zhengxin Zhang, Qing Li 0006, Dan Zhao 0003, Zili Meng, Dirk Kutscher, Ruoyu Li 0003, Yong Jiang 0001, Mingwei Xu. 1693-1712