1109 | -- | 1122 | Zerun Li, Xiaoming Chen 0003, Yuxin Yang 0002, Feng Min, Xiaoyu Zhang 0009, Yinhe Han 0001. A Data-Centric Software-Hardware Co-Designed Architecture for Large-Scale Graph Processing |
1123 | -- | 1137 | Xingyan Chen, Tian Du, Mu Wang, Tiancheng Gu, Yu Zhao 0019, Gang Kou, Changqiao Xu, Dapeng Oliver Wu. Towards Optimal Customized Architecture for Heterogeneous Federated Learning With Contrastive Cloud-Edge Model Decoupling |
1138 | -- | 1151 | Jinquan Wang, Zhisheng Huo, Limin Xiao, Jinqian Yang, Jiantong Huo, Minyi Guo. Hierarchical Hashing: A Dynamic Hashing Method With Low Write Amplification and High Performance for Non-Volatile Memory |
1152 | -- | 1167 | Jinkai Zhang, Yinghao Yang, Zhe Zhou, Zhicheng Hu, Xin Zhao, Liang Chang 0002, Hang Lu, Xiaowei Li 0001. Trident: The Acceleration Architecture for High-Performance Private Set Intersection |
1168 | -- | 1181 | Mingyuan Zhao, Hao Sheng 0001, Rongshan Chen, Ruixuan Cong, Tun Wang, Zhenglong Cui, Da Yang 0001, Shuai Wang 0027, Wei Ke 0001. A GPU-Enabled Framework for Light Field Efficient Compression and Real-Time Rendering |
1182 | -- | 1195 | Haotian Wang 0006, Yan Ding 0004, Yumeng Liu, Weichen Liu, Chubo Liu, Wangdong Yang, Kenli Li 0001. A Context-Awareness and Hardware-Friendly Sparse Matrix Multiplication Kernel for CNN Inference Acceleration |
1196 | -- | 1209 | Weijie Liu, Kai Lu, Zhiquan Lai, Shengwei Li, Keshi Ge, Dongsheng Li 0001, Xicheng Lu. AutoPipe-H: A Heterogeneity-Aware Data-Paralleled Pipeline Approach on Commodity GPU Servers |
1210 | -- | 1223 | Kejun Guo, Fuliang Li, Jiaxing Shen, Xing-Wei Wang 0001, Jiannong Cao 0001. Distributed Sketch Deployment for Software Switches |
1224 | -- | 1238 | Yingxue Gao, Teng Wang, Lei Gong, Chao Wang, Dong Dai 0001, Yang Yang 0080, Xianglan Chen, Xi Li, Xuehai Zhou. Hardware Accelerated Vision Transformer via Heterogeneous Architecture Design and Adaptive Dataflow Mapping |
1239 | -- | 1252 | Yunho Jang, Dongsu Kim, Yeseul Kim, Jongsun Park 0001. Big-Computing and Little-Storing STT-MRAM PIM Architecture With Charge Domain Based MAC Operation |
1253 | -- | 1266 | Ke Xu, Ming Tang 0002, Quancheng Wang, Han Wang. Microarchitectural Attacks and Mitigations on Retire Resources in Modern Processors |
1267 | -- | 1277 | Chun Huang, Jiaying Shao, Baolei Peng, Qingshuang Guo, Panlong Li, Junwei Sun, Yanfeng Wang. Design of a Universal Decoder Model Based on DNA Winner-Takes-All Neural Networks |
1278 | -- | 1292 | Giusy Iaria, Paolo Bernardi, Claudia Bertani, Lorenzo Cardone, Giuseppe Garozzo, Vincenzo Tancorre. A Comprehensive Scan Test Cost Model to Optimize the Production of Very Large SoCs |
1293 | -- | 1305 | Jaeil Lim, Jaewon Chung, Donghun Jeong, Daegeun Jee, Euicheol Lim. A New ECC Configuration Method for DRAM System Considering Metadata |
1306 | -- | 1321 | Kaijie Wei, Hideharu Amano, Ryohei Niwase, Yoshiki Yamaguchi, Takefumi Miyoshi. Qu-Trefoil: Large-Scale Quantum Circuit Simulator Working on FPGA With SATA Storages |
1322 | -- | 1333 | Yulong Li, Wenxin Li 0001, Yuxuan Du, Yinan Yao, Song Zhang, Linxuan Zhong, Keqiu Li. Flexible Job Scheduling With Spatial-Temporal Compatibility for In-Network Aggregation |
1334 | -- | 1347 | Sirong Zhao, Guoqi Xie, Chenglai Xiong, Kenli Li 0001, Xuejun Yu, Bo Wan, Yiwen Jiang. AVL Function Table for LeafHooks Insertion With Obfuscated Control Flow Integrity |
1348 | -- | 1361 | Peyman Dehghanzadeh, Ovishake Sen, Baibhab Chatterjee, Swarup Bhunia. LUNA-CiM: A Programmable Compute-in-Memory Fabric for Neural Network Acceleration |
1362 | -- | 1376 | Jin Ye, Yajun Peng, Yijun Li 0002, Zhaoyi Li, Jiawei Huang 0001. Asynchronous Control Based Aggregation Transport Protocol for Distributed Deep Learning |
1377 | -- | 1391 | Trevor E. Pogue, Nicola Nicolici. Karatsuba Matrix Multiplication and Its Efficient Custom Hardware Implementations |
1392 | -- | 1404 | Guangkuo Yang, Meng Zhang 0014, Peng Guo, Xuepeng Zhan, Shaoqi Yang, Xiaohuan Zhao, Xinyi Guo, Pengpeng Sang, Jixuan Wu, Fei Wu 0005, Jiezhi Chen. High-Precision Error Bit Prediction for 3D QLC NAND Flash Memory: Observations, Analysis, and Modeling |
1405 | -- | 1417 | Li Yang, Wei Zhang, Yinbin Miao, Yanrong Liang, Xinghua Li 0001, Kim-Kwang Raymond Choo, Robert H. Deng. Secure and Efficient Cross-Modal Retrieval Over Encrypted Multimodal Data |
1418 | -- | 1430 | Abdulbary Naji, Xingfu Wang, Ping Liu 0008, Ammar Hawbani, Liang Zhao 0004, Xiaohua Xu 0002, Fuyou Miao. NetCRC-NR: In-Network 5G NR CRC Accelerator |
1431 | -- | 1445 | Zhixin Zhao, Yitao Hu, Guotao Yang, Ziqi Gong, Chen Shen, Laiping Zhao, Wenxin Li 0001, Xiulong Liu, Wenyu Qu. SLOpt: Serving Real-Time Inference Pipeline With Strict Latency Constraint |
1446 | -- | 1460 | Vasileios Titopoulos, Kosmas Alexandridis, Christodoulos Peltekis, Chrysostomos Nicopoulos, Giorgos Dimitrakopoulos. Optimizing Structured-Sparse Matrix Multiplication in RISC-V Vector Processors |
1461 | -- | 1469 | Argyris Kokkinis, Georgios Zervakis 0001, Kostas Siozios, Mehdi Baradaran Tahoori, Jörg Henkel. Enabling Printed Multilayer Perceptrons Realization via Area-Aware Neural Minimization |