Journal: IEEE Trans. Parallel Distrib. Syst.

Volume 35, Issue 9

1524 -- 1535Yi-Wei Ci, Michael R. Lyu, Zhan Zhang, De-Cheng Zuo, Xiao-Zong Yang. KLNK: Expanding Page Boundaries in a Distributed Shared Memory System
1536 -- 1550Sheng Qi, Chao Jin, Mosharaf Chowdhury, Zhenming Liu, Xuanzhe Liu, Xin Jin 0008. Pyxis: Scheduling Mixed Tasks in Disaggregated Datacenters
1551 -- 1564Ahmad Tarraf, Martin Schreiber 0001, Alberto Cascajo, Jean-Baptiste Besnard, Marc-André Vef, Dominik Huber, Sonja Happ, André Brinkmann, David E. Singh, Hans-Christian Hoppe, Alberto Miranda, Antonio J. Peña, Rui Machado, Marta Garcia-Gasulla, Martin Schulz 0001, Paul M. Carpenter, Simon Pickartz, Tiberiu Rotaru, Sergio Iserte, Víctor López 0003, Jorge Ejarque, Heena Sirwani, Jesús Carretero 0001, Felix Wolf 0001. Malleability in Modern HPC Systems: Current Experiences, Challenges, and Future Opportunities
1565 -- 1582Jiuchen Shi, Kaihua Fu, Jiawen Wang, Quan Chen 0002, Deze Zeng, Minyi Guo. Adaptive QoS-Aware Microservice Deployment With Excessive Loads via Intra- and Inter-Datacenter Scheduling
1583 -- 1597Rong Cong, Zhiwei Zhao, Linyuanqi Zhang, Geyong Min. Cost-Effective Server Deployment for Multi-Access Edge Networks: A Cooperative Scheme
1598 -- 1614Yifan Hua, Shengan Zheng, Weihan Kong, Cong Zhou, Kaixin Huang, Ruoyan Ma, Linpeng Huang. RADAR: A Skew-Resistant and Hotness-Aware Ordered Index Design for Processing-in-Memory Systems
1615 -- 1629Dhruv Gajaria, Kevin Antony Gomez, Tosiron Adegbija. STT-RAM-Based Hierarchical in-Memory Computing
1630 -- 1643Ran Wang 0014, Cheng Xu 0003, Xiaotong Zhang 0002. Toward Materials Genome Big-Data: A Blockchain-Based Secure Storage and Efficient Retrieval Method
1644 -- 1656Yuchen Zhong, Guangming Sheng, Juncheng Liu, Jinhui Yuan, Chuan Wu 0001. Swift: Expedited Failure Recovery for Large-Scale DNN Training
1657 -- 1671Gabriele Mencagli, Patrizio Dazzi, Massimo Coppola. Springald: GPU-Accelerated Window-Based Aggregates Over Out-of-Order Data Streams
1672 -- 1689Cunyang Wei, Haipeng Jia, Yunquan Zhang, Jianyu Yao, Chendi Li, Wenxuan Cao. IrGEMM: An Input-Aware Tuning Framework for Irregular GEMM on ARM and X86 CPUs

Volume 35, Issue 8

1331 -- 1344Jinfan Chen, Shigang Li 0002, Ran Guo, Jinhui Yuan, Torsten Hoefler. AutoDDL: Automatic Distributed Deep Learning With Near-Optimal Bandwidth Cost
1345 -- 1359Isra Mohamed Ali, Mohamed M. Abdallah 0001. On Off-Chaining Smart Contract Runtime Protection: A Queuing Model Approach
1360 -- 1372Yanxi Zhang, Muyu Mei, Dongqi Yan, Xu Zhang, Qinghai Yang, Mingwu Yao. Age-of-Event Aware: Sampling Period Optimization in a Three-Stage Wireless Cyber-Physical System With Diverse Parallelisms
1373 -- 1386Yang Zhou, Fang Wang 0001, Zhan Shi 0001, Dan Feng 0001. The Static Allocation is Not a Static: Optimizing SSD Address Allocation Through Boosting Static Policy
1387 -- 1399Changmao Wu, Zhengwei Xu 0001, Xiaoming He, Qi Lou, Yuanyuan Xia, Shuman Huang. Proactive Caching With Distributed Deep Reinforcement Learning in 6G Cloud-Edge Collaboration Computing
1400 -- 1414Xiaofeng Hou, Xuehan Tang, Jiacheng Liu 0001, Chao Li 0009, Luhong Liang, Kwang-Ting Cheng. WASP: Efficient Power Management Enabling Workload-Aware, Self-Powered AIoT Devices
1415 -- 1428Shengwei Li, Kai Lu, Zhiquan Lai, Weijie Liu, Keshi Ge, Dong Sheng Li 0001. A Multidimensional Communication Scheduling Method for Hybrid Parallel DNN Training
1429 -- 1443Jingwen Zhou, Feifei Chen 0001, Guangming Cui, Yong Xiang 0001, Qiang He 0001. FEUAGame: Fairness-Aware Edge User Allocation for App Vendors
1444 -- 1455Jiantong Jiang, Zeyi Wen, Atif Bin Mansoor, Ajmal Mian. Faster-BNI: Fast Parallel Exact Inference on Bayesian Networks
1456 -- 1468Xinliang Wei, Kejiang Ye, Xinghua Shi, Cheng-Zhong Xu 0001, Yu Wang. Joint Participant and Learning Topology Selection for Federated Learning in Edge Clouds
1469 -- 1487Chengying Huan, Yongchao Liu, Heng Zhang 0005, Hang Liu 0001, Shiyang Chen, Shuaiwen Leon Song, Yanjun Wu. TeGraph+: Scalable Temporal Graph Processing Enabling Flexible Edge Modifications
1488 -- 1505Liang Geng, Hao Wang 0002, Jingsong Meng, Dayi Fan, Sami Ben-romdhane, Hari Kadayam Pichumani, Vinay Phegade, Xiaodong Zhang. RR-Compound: RDMA-Fused gRPC for Low Latency, High Throughput, and Easy Interface
1506 -- 1523Junxue Zhang 0001, Xiaodian Cheng, Liu Yang 0008, Jinbin Hu, Han Tian, Kai Chen 0005. High-Performance Hardware Acceleration Architecture for Cross-Silo Federated Learning

Volume 35, Issue 7

1122 -- 1138Runzhen Xue, Dengke Han, Mingyu Yan, Mo Zou, Xiaocheng Yang, Duo Wang, Wenming Li, Zhimin Tang, John Kim, Xiaochun Ye, Dongrui Fan. HiHGNN: Accelerating HGNNs Through Parallelism and Data Reusability Exploitation
1139 -- 1154Bowen Zhang 0004, Huaxi Gu, Grace Li Zhang, Yintang Yang, Ziteng Ma, Ulf Schlichtmann. A 3D Hybrid Optical-Electrical NoC Using Novel Mapping Strategy Based DCNN Dataflow Acceleration
1155 -- 1173Chen Chen 0067, Hong Xu 0001, Wei Wang 0030, Baochun Li, Bo Li 0001, Li Chen 0008, Gong Zhang 0001. Synchronize Only the Immature Parameters: Communication-Efficient Federated Learning By Freezing Parameters Adaptively
1174 -- 1188Xiaqing Li, Qi Guo 0001, Guangyan Zhang, Siwei Ye, Guanhua He, Yiheng Yao, Rui Zhang 0040, Yifan Hao, Zidong Du, Weimin Zheng. FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space
1189 -- 1206Linsi Lan, Junbo Wang 0001, Zhi Li, Krishna Kant 0001, Wanquan Liu. FedREM: Guided Federated Learning in the Presence of Dynamic Device Unpredictability
1207 -- 1220Rahul Mishra 0001, Hari Prabhat Gupta, Garvit Banga, Sajal K. Das 0001. Fed-RAC: Resource-Aware Clustering for Tackling Heterogeneity of Participants in Federated Learning
1221 -- 1238Yuyang Jin, Haojie Wang, Runxin Zhong, Chen Zhang, Xia Liao, Feng Zhang, Jidong Zhai. Graph-Centric Performance Analysis for Large-Scale Parallel Applications
1239 -- 1250Yuzhen Zhao, Xiyu Liu. Spiking Neural P Systems With Microglia
1251 -- 1267Liang Zhang, Wenli Zheng, Kuangyu Zheng, Hongzi Zhu, Chao Li 0009, Minyi Guo. Bayesian-Driven Automated Scaling in Stream Computing With Multiple QoS Targets
1268 -- 1280Lu Zhao 0001, Fu Xiao 0001, Bo Li 0103, Jian Zhou 0009, Xiaolong Xu 0001, Yun Yang 0001. Availability-Aware Revenue-Effective Application Deployment in Multi-Access Edge Computing
1281 -- 1292Kai Zhang 0006, Jiahui Hong, Zhengying He, Yinan Jing, X. Sean Wang. AdaptChain: Adaptive Data Sharing and Synchronization for NFV Systems on Heterogeneous Architectures
1293 -- 1306Chilankamol Sunny, Satyajit Das, Kevin J. M. Martin, Philippe Coussy. CREPE: Concurrent Reverse-Modulo-Scheduling and Placement for CGRAs
1307 -- 1319Daniela Loreti, Marcello Artioli, Anna Ciampolini. Rollback-Free Recovery for a High Performance Dense Linear Solver With Reduced Memory Footprint
1320 -- 1330Sai Zhang, Li Tang 0001, Yan-Jun Liu 0003. Adaptive Neural Control for a Network of Parabolic PDEs With Event-Triggered Mechanism

Volume 35, Issue 6

707 -- 721Jiamin Fan, Kui Wu 0001, Guoming Tang, Yang Zhou, Shengqiang Huang. Taking Advantage of the Mistakes: Rethinking Clustered Federated Learning for IoT Anomaly Detection
722 -- 736Xinyi Ji, Jiankuo Dong, Tonggui Deng, Pinchang Zhang, Jiafeng Hua, Fu Xiao 0001. HI-Kyber: A Novel High-Performance Implementation Scheme of Kyber Based on GPU
737 -- 749Chiranjeb Mondal, Sanjay V. Rajopadhye. Taking RNA-RNA Interaction to Machine Peak
750 -- 763Siqi Wang, Tianyu Feng, Hailong Yang, Xin You, Bangduo Chen, Tongxuan Liu, Zhongzhi Luan, Depei Qian. AtRec: Accelerating Recommendation Model Training on CPUs
764 -- 776Wei-Mei Chen, Hsin-Hung Tsai, Joon Fong Ling. Parallel Computation of Dominance Scores for Multidimensional Datasets on GPUs
777 -- 794Zirui Liu, Yikai Zhao, Zhuochen Fan, Tong Yang 0003, Xiaodong Li, Ruwen Zhang, Kaicheng Yang, Zihan Jiang, Zheng Zhong, Yi Huang, Cong Liu, Jing Hu, Gaogang Xie, Bin Cui 0001. BurstBalancer: Do Less, Better Balance for Large-Scale Data Center Traffic
795 -- 812Jiesong Liu, Feng Zhang 0007, Lv Lu, Chang Qi, Xiaoguang Guo, Dong Deng 0001, Guoliang Li 0001, Huanchen Zhang, Jidong Zhai, Hechen Zhang, Yuxing Chen, Anqun Pan, Xiaoyong Du 0001. G-Learned Index: Enabling Efficient Learned Index on GPU
813 -- 827Amirhossein Taherpour, Xiaodong Wang. HybridChain: Fast, Accurate, and Secure Transaction Processing With Distributed Learning
828 -- 842Pourya Soltani, Farid Ashtiani. Analytical Modeling and Throughput Computation of Blockchain Sharding
843 -- 856Zheng Zhang 0036, Yaqi Xia, Hulin Wang, Donglin Yang, Chuang Hu, Xiaobo Zhou, Dazhao Cheng. MPMoE: Memory Efficient MoE for Pre-Trained Models With Adaptive Pipeline Parallelism
857 -- 873Cheng Wang, Kun Xie 0001, Jiazheng Tian, Jigang Wen, Xiaocan Li, Gaogang Xie, Kenli Li 0001. HPETC: History Priority Enhanced Tensor Completion for Network Distance Measurement
874 -- 888Kaiyang Liu, Jingrong Wang, Zhiming Huang 0002, Jianping Pan 0001. Sampling-Based Multi-Job Placement for Heterogeneous Deep Learning Clusters
889 -- 900Guoqing Xiao 0001, Chuanghui Yin, Yuedan Chen, Mingxing Duan, Kenli Li 0001. Efficient Utilization of Multi-Threading Parallelism on Heterogeneous Systems for Sparse Tensor Contraction
901 -- 918Xin Du, Minglong Wang, Zhihui Lu 0002, Qiang Duan, Yuhao Liu, Jianfeng Feng, Huarui Wang. HRCM: A Hierarchical Regularizing Mechanism for Sparse and Imbalanced Communication in Whole Human Brain Simulations
919 -- 936Dazhao Cheng, Kai Yan, Xinquan Cai, Yili Gong, Chuang Hu. SLO-Aware Function Placement for Serverless Workflows With Layer-Wise Memory Sharing
937 -- 951Chen Wang 0004, Kathryn M. Mohror, Marc Snir. Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems
952 -- 966Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu 0001, Quyang Pan, Xuefeng Jiang, Bo Gao. FedICT: Federated Multi-Task Distillation for Multi-Access Edge Computing

Volume 35, Issue 5

707 -- 719Yi-Chien Lin, Bingyi Zhang, Viktor K. Prasanna. HitGNN: High-Throughput GNN Training Framework on CPU+Multi-FPGA Heterogeneous Platform
720 -- 731Tianyu Zeng, Xiaoxi Zhang, Jingpu Duan, Chao Yu 0004, Chuan Wu 0001, Xu Chen 0004. An Offline-Transfer-Online Framework for Cloud-Edge Collaborative Distributed Reinforcement Learning
732 -- 750Yuhang Liu 0001, Xin Deng, Jiapeng Zhou, Mingyu Chen 0001, Yungang Bao. Suppressing the Interference Within a Datacenter: Theorems, Metric and Strategy
751 -- 767Enge Song, Tian Pan 0001, Haoyu Song 0001, Qiang Fu 0011, Yingjiang Liu, Chenhao Jia, Chuanying Yuan, Minglan Gao, Jiao Zhang 0002, Tao Huang 0005, Yunjie Liu 0001. INT-Label: Lightweight In-Band Network-Wide Telemetry via Distributed Labeling
768 -- 779Fan Yuan, Xiaojian Yang, Shengguo Li, Dezun Dong, Chun Huang, Zheng Wang. Optimizing Multi-Grid Preconditioned Conjugate Gradient Method on Multi-Cores
780 -- 795Yuanhong Zhang, Weizhan Zhang, Haipeng Du, Caixia Yan, Li Liu, Qinghua Zheng. FHVAC: Feature-Level Hybrid Video Adaptive Configuration for Machine-Centric Live Streaming
796 -- 813Bowen Zhang, Shengan Zheng, Liangxu Nie, Zhenlin Qi, Hongyi Chen, Linpeng Huang, Hong Mei 0001. Revisiting PM-Based B+-Tree With Persistent CPU Cache
814 -- 828Anshuman Misra, Ajay D. Kshemkalyani. Byzantine-Tolerant Causal Ordering for Unicasts, Multicasts, and Broadcasts
829 -- 843Dingding Li, Weijie Zhang, Mianxiong Dong, Kaoru Ota. DMA-Assisted I/O for Persistent Memory
844 -- 861Runzhou Han, Mai Zheng, Suren Byna, Houjun Tang, Bin Dong 0002, Dong Dai 0001, Yong Chen 0001, Dongkyun Kim, Joseph Hassoun, David Thorsley. PROV-IO+: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems

Volume 35, Issue 4

517 -- 530Jian Yang, Jiantong Jiang, Zeyi Wen, Ajmal Mian. Parallel and Distributed Bayesian Network Structure Learning
531 -- 547Jie Song 0001, Peimeng Zhu, Yanfeng Zhang, Ge Yu 0001. CloudSimPer: Simulating Geo-Distributed Datacenters Powered by Renewable Energy Mix
548 -- 559Jie Xu 0031, Yulong Ming, Zihan Wu, Cong Wang 0001, Xiaohua Jia. X-Shard: Optimistic Cross-Shard Transaction Processing for Sharding-Based Blockchains
560 -- 576Zhaojie Wen, Qiong Chen, Yipei Niu, Zhen Song, Quanfeng Deng, Fangming Liu. Joint Optimization of Parallelism and Resource Configuration for Serverless Function Steps
577 -- 591Dongsheng Li 0001, Shengwei Li, Zhiquan Lai, Yongquan Fu, Xiangyu Ye, Lei Cai, Linbo Qiao. A Memory-Efficient Hybrid Parallel Framework for Deep Neural Network Training
592 -- 603Meilin Yang, Jian Xu, Wenbo Ding, Yang Liu. FedHAP: Federated Hashing With Global Prototypes for Cross-Silo Retrieval
604 -- 615Amanda Jayanetti, Saman K. Halgamuge, Rajkumar Buyya. Multi-Agent Deep Reinforcement Learning Framework for Renewable Energy-Aware Workflow Scheduling on Distributed Cloud Data Centers
616 -- 633Jianyuan Lu, Tian Pan 0001, Shan He, Mao Miao, Guangzhe Zhou, Yining Qi, Shize Zhang, Enge Song, Xiaoqing Sun, Huaiyi Zhao, Biao Lyu, Shunmin Zhu. CloudSentry: Two-Stage Heavy Hitter Detection for Cloud-Scale Gateway Overload Protection
634 -- 645Subhadeep Karan, Zainul Abideen Sayed, Jaroslaw Zola. End-to-End Bayesian Networks Exact Learning in Shared Memory
646 -- 662Ke Cheng, Sheng Zhang 0001, Meizhao Liu, Yingcheng Gu, Liu Wei, Huanyu Cheng, Kai Liu, Yu Song, Xiaohang Shi, Andong Zhu, Lei Tang. GeoScale: Microservice Autoscaling With Cost Budget in Geo-Distributed Edge Clouds
663 -- 674Zhe Wang 0042, Jia Hu, Geyong Min, Zhiwei Zhao, Zi Wang. Agile Cache Replacement in Edge Computing via Offline-Online Deep Reinforcement Learning
675 -- 692Wai-Kong Lee, Raymond K. Zhao, Ron Steinfeld, Amin Sakzad, Seong Oun Hwang. High Throughput Lattice-Based Signatures on GPUs: Comparing Falcon and Mitaka
693 -- 706Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun. Runtime Performance Anomaly Diagnosis in Production HPC Systems Using Active Learning

Volume 35, Issue 3

377 -- 390Di Wu, Rehmat Ullah 0001, Philip Rodgers, Peter Kilpatrick, Ivor T. A. Spence, Blesson Varghese. EcoFed: Efficient Communication for DNN Partitioning-Based Federated Learning
391 -- 404Xing Chen 0002, Shengxi Hu, Chujia Yu, Zheyi Chen, Geyong Min. Real-Time Offloading for Dependent and Parallel Tasks in Cloud-Edge Environments Using Deep Reinforcement Learning
405 -- 420Linpeng Jia, Yanxiu Liu, Keyuan Wang, Yi Sun. Estuary: A Low Cross-Shard Blockchain Sharding Protocol Based on State Splitting
421 -- 438Daoce Wang, Jesus Pulido, Pascal Grosset, Sian Jin, Jiannan Tian, Kai Zhao 0008, James P. Ahrens, Dingwen Tao. TAC+: Optimizing Error-Bounded Lossy Compression for 3D AMR Simulations
439 -- 454Weiling Yang, Jianbin Fang, Dezun Dong, Xing Su, Zheng Wang 0001. Optimizing Full-Spectrum Matrix Multiplications on ARMv8 Multi-Core CPUs
455 -- 469Guangjing Huang, Qiong Wu 0009, Peng Sun 0003, Qian Ma 0002, Xu Chen 0004. Collaboration in Federated Learning With Differential Privacy: A Stackelberg Game Analysis
470 -- 483Fatemeh Elahi, Mahmood Fazlali, Hadi Tabatabaee Malazi, Mehdi Elahi. Parallel Fractional Stochastic Gradient Descent With Adaptive Learning for Recommender Systems
484 -- 498Vinicius S. da Silva, Everton Camargo de Lima, Janaina Schwarzrock, Fábio D. Rossi, Marcelo Caggiani Luizelli, Antonio Carlos Schneider Beck, Arthur Francisco Lorenzon. Synergistically Rebalancing the EDP of Container-Based Parallel Applications
499 -- 516Jialun Li, Jieqian Yao, Danyang Xiao, Diying Yang, Weigang Wu. EvoGWP: Predicting Long-Term Changes in Cloud Workloads Using Deep Graph-Evolution Learning

Volume 35, Issue 2

203 -- 220Ajay Singh, Trevor Alexander Brown, Ali José Mashtizadeh. Simple, Fast and Widely Applicable Concurrent Memory Reclamation via Neutralization
221 -- 236Zhiyuan Wang, Hongli Xu, Yang Xu 0020, Zhida Jiang, JianChun Liu, Suo Chen. FAST: Enhancing Federated Learning Through Adaptive Data Sampling and Local Training
237 -- 249Gang Zeng, Jianfeng Zhu 0001, Yichi Zhang, Ganhui Chen, Zhenhai Yuan, Shaojun Wei, Leibo Liu. A High-Performance Genomic Accelerator for Accurate Sequence-to-Graph Alignment Using Dynamic Programming Algorithm
250 -- 263Junyan Qian, Kunzhu Qiu, Hao Ding 0007, Huimin Zhang, Zhongyi Zhai. An Efficient Bottleneck Planes Exclusion Method for Reconfiguring 3D VLSI Arrays
264 -- 279Yong Dong, Yiqin Dai, Min Xie, Kai Lu, Ruibo Wang, Juan Chen 0001, Mingtian Shao, Zheng Wang 0001. Faster and Scalable MPI Applications Launching
280 -- 296Jing Wu, Lin Wang 0015, Qirui Jin, Fangming Liu. Graft: Efficient Inference Serving for Hybrid Deep Learning With SLO Guarantees via DNN Re-Alignment
297 -- 309Ning Li, Jianmei Guo, Bo Huang 0002, Yuyang Li, Yilei Zhang, Chengdong Li, Wenxin Huang. TCSA: Efficient Localization of Busy-Wait Synchronization Bugs for Latency-Critical Applications
310 -- 323Hao-Rui Chen, Lei Yang 0024, Xinglin Zhang, Jiaxing Shen, Jiannong Cao 0001. Distributed Semi-Supervised Learning With Consensus Consistency on Edge Devices
324 -- 337Zhao Liu, XueSen Chu, Xiaojing Lv, Hongsong Meng, Hanyue Liu, Guanghui Zhu, Haohuan Fu, Guangwen Yang. SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Advanced Heterogeneous Supercomputers
338 -- 348Javier Navaridas, Markos Kynigos, Jose Antonio Pascual, Mikel Luján, José Miguel-Alonso, John Goodacre. Understanding the Impact of Arbitration in MZI-Based Beneš Switching Fabrics
349 -- 361Jiaqi Yang, Hao Zheng 0005, Ahmed Louri. Versa-DNN: A Versatile Architecture Enabling High-Performance and Energy-Efficient Multi-DNN Acceleration
362 -- 376Dongyu Zheng, Lei Liu 0003, Guoming Tang, Yi Wang 0004, Weichao Li 0001. Power Demand Reshaping Using Energy Storage for Distributed Edge Clouds

Volume 35, Issue 12

2297 -- 2314Quan Deng, Qiang Liu 0011, Ming Yuan, Xiaohui Duan, Lin Gan, Jinzhe Yang, Wenlai Zhao, Zhenxiang Zhang, Guiming Wu, Wayne Luk, Haohuan Fu, Guangwen Yang. Acceleration of Multi-Body Molecular Dynamics With Customized Parallel Dataflow
2315 -- 2330Liang Wang, Jinzhe Yang, Jidong Zhai, Guangwen Yang. Optimizing I/O Performance Through Effective vCPU Scheduling Interference Management
2331 -- 2346Shuangwu Chen, Jiangming Li, Qifeng Yuan, Huasen He, Sen Li, Jian Yang 0014. Two-Timescale Joint Optimization of Task Scheduling and Resource Scaling in Multi-Data Center System Based on Multi-Agent Deep Reinforcement Learning
2347 -- 2360Francesco De Pellegrini, Vaibhav Kumar Gupta, Rachid El Azouzi, Serigne Gueye, Cédric Richier, Jeremie Leguay. Fair Coflow Scheduling via Controlled Slowdown
2361 -- 2374Devki Nandan Jha, Yinhao Li, Zhenyu Wen, Graham Morgan, Prem Prakash Jayaraman, Maciej Koutny, Omer F. Rana, Rajiv Ranjan 0001. GeoDeploy: Geo-Distributed Application Deployment Using Benchmarking
2375 -- 2391Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li 0001, Olli Saarikivi, Saeed Maleki, Fan Yang 0024. Efficient Schedule Construction for Distributed Execution of Large DNN Models
2392 -- 2404Qiushi Zheng, Jiong Jin, Zhishu Shen, Libing Wu, Iftekhar Ahmad, Yong Xiang 0001. Distributed Task Processing Platform for Infrastructure-Less IoT Networks: A Multi-Dimensional Optimization Approach
2405 -- 2422Bingyi Zhang, Rajgopal Kannan, Carl E. Busart, Viktor K. Prasanna. VisionAGILE: A Versatile Domain-Specific Accelerator for Computer Vision Tasks
2423 -- 2434Jinyu Hu, Huizhang Luo, Hong Jiang 0001, Guoqing Xiao 0001, Kenli Li 0001. FastLoad: Speeding Up Data Loading of Both Sparse Matrix and Vector for SpMV on GPUs
2435 -- 2448Rong Hu, Haotian Wang 0006, Wangdong Yang, Renqiu Ouyang, Keqin Li 0001, Kenli Li 0001. BCB-SpTC: An Efficient Sparse High-Dimensional Tensor Contraction Employing Tensor Core Acceleration
2449 -- 2462Binghan Wu, Wei Bao 0001, Bing Bing Zhou. Competitive Analysis of Online Elastic Caching of Transient Data in Multi-Tiered Content Delivery Network
2463 -- 2478Zhenhua Guo 0003, Yinan Tang, Jidong Zhai, Tongtong Yuan, Jian Jin, Li Wang 0040, Yaqian Zhao, RenGang Li. A Survey on Performance Modeling and Prediction for Distributed DNN Training
2479 -- 2496Hui Sun 0002, Deyan Kong, Song Jiang, Yinliang Yue, Xiao Qin 0001. TrieKV: A High-Performance Key-Value Store Design With Memory as Its First-Class Citizen
2497 -- 2512Keyuan Wang, Linpeng Jia, Zhaoxiong Song, Yi Sun. Mitosis: A Scalable Sharding System Featuring Multiple Dynamic Relay Chains
2513 -- 2526Chunlin Tian, Li Li 0064, Kahou Tam, Yebo Wu, Cheng-Zhong Xu 0001. Breaking the Memory Wall for Heterogeneous Federated Learning via Model Splitting
2527 -- 2544Ruchi Bhoot, Suved Sanjay Ghanmode, Yogesh Simmhan. TARIS: Scalable Incremental Processing of Time-Respecting Algorithms on Streaming Graphs

Volume 35, Issue 11

1879 -- 1890Yichen Li 0006, Wenchao Xu 0001, Yining Qi, Haozhao Wang, Ruixuan Li 0001, Song Guo 0001. SR-FDIL: Synergistic Replay for Federated Domain-Incremental Learning
1891 -- 1903Yuezhi Che, Dazhao Cheng, Xiao Wang 0012, Rujia Wang. Opca: Enabling Optimistic Concurrent Access for Multiple Users in Oblivious Data Storage
1904 -- 1919Yaqi Xia, Zheng Zhang 0036, Donglin Yang, Chuang Hu, Xiaobo Zhou 0002, Hongyang Chen, Qianlong Sang, Dazhao Cheng. Redundancy-Free and Load-Balanced TGNN Training With Hierarchical Pipeline Parallelism
1920 -- 1935Haoran Zhou, Wei Rang, Hongyang Chen, Xiaobo Zhou 0002, Dazhao Cheng. DeepTM: Efficient Tensor Management in Heterogeneous Memory for DNN Training
1936 -- 1948Sanjay Lall, Calin Cascaval, Martin Izzard, Tammo Spalink. Logical Synchrony and the Bittide Mechanism
1949 -- 1963Peng Wang, Hong Jiang 0001, Yu Liu 0040, Zhelong Zhao, Ke Zhou 0001, Zhihai Huang. Beyond Belady to Attain a Seemingly Unattainable Byte Miss Ratio for Content Delivery Networks
1964 -- 1976Shiyu Shen, Hao Yang, Wangchen Dai, Hong Zhang, Zhe Liu 0001, Yunlei Zhao. High-Throughput GPU Implementation of Dilithium Post-Quantum Digital Signature
1977 -- 1988Hua Huang, Edmond Chow. Exploring the Design Space of Distributed Parallel Sparse Matrix-Multiple Vector Multiplication
1989 -- 2005Zhaojie Wen, Qiong Chen, Quanfeng Deng, Yipei Niu, Zhen Song, Fangming Liu. ComboFunc: Joint Resource Combination and Container Placement for Serverless Function Scaling With Heterogeneous Container
2006 -- 2022Huali Lu, Feng Lyu 0001, Ju Ren 0001, Huaqing Wu, Conghao Zhou, Zhongyuan Liu, Yaoxue Zhang, Xuemin Shen. CODE+: Fast and Accurate Inference for Compact Distributed IoT Data Collection
2023 -- 2038Di Mou, Bo Wang, Dajiang Liu. SC-CGRA: An Energy-Efficient CGRA Using Stochastic Computing
2039 -- 2053Yin Xu, Mingjun Xiao, Jie Wu 0001, He Sun. Privacy Preserving Task Push in Spatial Crowdsourcing With Unknown Popularity
2054 -- 2068Lan Zhang 0002, Anran Li, Hongyi Peng, Feng Han, Fan Huang, Xiang-Yang Li 0001. Privacy-Preserving Data Selection for Horizontal and Vertical Federated Learning
2069 -- 2086Kai Chen 0020, Qingjun Qu, Feng Zhu 0009, Zhengming Yi, Wenjie Tang. CPLNS: Cooperative Parallel Large Neighborhood Search for Large-Scale Multi-Agent Path Finding
2087 -- 2101Qiqi Duan, Chang Shao, Guochen Zhou, Minghan Zhang, Qi Zhao, Yuhui Shi. Distributed Evolution Strategies With Multi-Level Learning for Large-Scale Black-Box Optimization
2102 -- 2113Ping Luo 0002, Jieren Cheng, Neal Xiong 0001, Zhenhao Liu, Jie Wu 0001. FedVeca: Federated Vectorized Averaging on Non-IID Data With Adaptive Bi-Directional Global Objective
2114 -- 2131Hui Dou, Yilun Wang, Yiwen Zhang 0001, Pengfei Chen 0002, Zibin Zheng. +: A Low-Cost and Transferrable Online Configuration Auto-Tuning Approach for Big Data Frameworks
2132 -- 2146Biao Hou, Song Yang 0002, Fan Li 0001, Liehuang Zhu, Lei Jiao 0002, Xu Chen 0004, Xiaoming Fu 0001. Gamora: Learning-Based Buffer-Aware Preloading for Adaptive Short Video Streaming
2147 -- 2160Feng Yao, Qian Tao, Shengyuan Lin, Yanfeng Zhang, Wenyuan Yu, Shufeng Gong, Qiange Wang, Ge Yu 0001, Jingren Zhou. Towards Efficient Graph Processing in Geo-Distributed Data Centers
2161 -- 2176Darong Huang, Luis Costero, David Atienza. An Evaluation Framework for Dynamic Thermal Management Strategies in 3D MultiProcessor System-on-Chip Co-Design
2177 -- 2192Rui Tian, Jiazhi Jiang, Jiangsu Du, Dan Huang, Yutong Lu. Sophisticated Orchestrating Concurrent DLRM Training on CPU/GPU Platform
2193 -- 2207Donglei Wu, Weihao Yang, Xiangyu Zou, Hao Feng, Dingwen Tao, Shiyi Li, Wen Xia, Binxing Fang. BIRD+: Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms
2208 -- 2223Yuyang Jin, Runxin Zhong, Saiqin Long, Jidong Zhai. Efficient Inference for Pruned CNN Models on Mobile Devices With Holistic Sparsity Alignment
2224 -- 2238Shouxi Luo, Renyi Wang, Ke Li 0020, Huanlai Xing. Efficient Cross-Cloud Partial Reduce With CREW
2239 -- 2253Renyou Xie, Chaojie Li, Xiaojun Zhou, Zhaoyang Dong. Accelerating Communication-Efficient Federated Multi-Task Learning With Personalization and Fairness
2254 -- 2269Hanfei Yu, Hao Wang 0022, Jian Li 0008, Xu Yuan 0001, Seung-Jong Park. Freyr+: Harvesting Idle Resources in Serverless Computing via Deep Reinforcement Learning
2270 -- 2283Jiandong Liu, Lan Zhang 0002, Fengxiang He, Chi Zhang 0043, Shanyang Jiang, Xiang-Yang Li 0001. Communication-Efficient Regret-Optimal Distributed Online Convex Optimization
2284 -- 2296Renwen Ma, Kai Hwang 0001, Mo Li, Yiming Miao. Trusted Model Aggregation With Zero-Knowledge Proofs in Federated Learning

Volume 35, Issue 10

1708 -- 1720Jiaxing Qi, Wencong Xiao, Mingzhen Li, Chaojie Yang, Yong Li, Wei Lin, Hailong Yang, Zhongzhi Luan, Depei Qian. ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG
1810 -- 1825Rong Chen 0001, Xingda Wei, Xiating Xie, Haibo Chen 0001. Locality-Preserving Graph Traversal With Split Live Migration
1826 -- 1840Mi Zhang 0007, Qihan Kang, Patrick P. C. Lee. FlexRaft: Exploiting Flexible Erasure Coding for Minimum-Cost Consensus and Fast Recovery
1841 -- 1853Peixuan Li, Ping Xie, Qiang Cao 0001. SSRAID: A Stripe-Queued and Stripe-Threaded Merging I/O Strategy to Improve Write Performance of Serial Interface SSD RAID
1854 -- 1866Fatemeh Keshavarz-Kohjerdi. Paired Many-to-Many 2-Disjoint Path Covers in Meshes
1867 -- 1878Jiangfei Duan, Xiuhong Li, Ping Xu, Xingcheng Zhang, Shengen Yan, Yun Liang 0001, Dahua Lin. Proteus: Simulating the Performance of Distributed DNN Training

Volume 35, Issue 1

1 -- 19Qiufen Xia, Zhiwei Jiao, Zichuan Xu. Online Learning Algorithms for Context-Aware Video Caching in D2D Edge Networks
20 -- 33Qingxiao Sun, Yi Liu 0013, Hailong Yang, Zhonghui Jiang, Zhongzhi Luan, Depei Qian. Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs
34 -- 45Qixiang Chen, Zhijun Chen, Kai Zhang 0006, X. Sean Wang. CLIC: An Extensible and Efficient Cross-Platform Data Analytics System
46 -- 58Yuan Li 0029, Ahmed Louri, Avinash Karanth. A High-Performance and Energy-Efficient Photonic Architecture for Multi-DNN Acceleration
59 -- 72Fangming Liu, Yipei Niu. Demystifying the Cost of Serverless Computing: Towards a Win-Win Deal
73 -- 88Isam Mashhour Al Jawarneh, Paolo Bellavista, Antonio Corradi, Luca Foschini 0001, Rebecca Montanari. SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor
89 -- 104Zhe Jiang 0004, Kecheng Yang 0001, Nathan Fisher, Nan Guan, Neil C. Audsley, Zheng Dong 0002. Hopscotch: A Hardware-Software Co-Design for Efficient Cache Resizing on Multi-Core SoCs
105 -- 122Zichuan Xu, Guangyuan Xu, Hao Wang, Weifa Liang, Qiufen Xia, Shangguang Wang. Enabling Streaming Analytics in Satellite Edge Computing via Timely Evaluation of Big Data Queries
123 -- 139Yunqi Gao, Bing Hu 0002, Mahdi Boloursaz Mashhadi, A-Long Jin, Pei Xiao, Chunming Wu 0001. US-Byte: An Efficient Communication Framework for Scheduling Unequal-Sized Tensor Blocks in Distributed Deep Learning
140 -- 153Changlong Li, Yu Liang 0004, Liang Shi, Chao Wang 0003, Chun Jason Xue, Xuehai Zhou. Flexible and Efficient Memory Swapping Across Mobile Devices With LegoSwap
154 -- 168Qiliang Li, Liangliang Xu, Yongkun Li 0001, Min Lyu, Wei Wang, Pengfei Zuo, Yinlong Xu. Enabling Efficient Erasure Coding in Disaggregated Memory Systems
169 -- 185Tiangang Li, Shi Ying, Yishi Zhao, Jianga Shang. Batch Jobs Load Balancing Scheduling in Cloud Computing Using Distributional Reinforcement Learning
186 -- 202Yaozheng Fang, Zhiyuan Zhou, Surong Dai, Jinni Yang, Hui Zhang 0002, Ye Lu. PaVM: A Parallel Virtual Machine for Smart Contract Execution and Validation