- Architectures for AI. Steven K. Reinhardt. 1
- FlightVGM: Efficient Video Generation Model Inference with Online Sparsification and Hybrid Precision on FPGAs. Jun Liu, Shulin Zeng, Li Ding 0012, Widyadewi Soedarmadji, Hao Zhou, Zehao Wang, Jinhao Li 0006, Jintao Li, Yadong Dai, Kairui Wen, Shan He, Yaqi Sun, Yu Wang, Guohao Dai. 2-13
- TreeLUT: An Efficient Alternative to Deep Neural Networks for Inference Acceleration Using Gradient Boosted Decision Trees. Alireza Khataei, Kia Bazargan. 14-24
- Greater than the Sum of its LUTs: Scaling Up LUT-based Neural Networks with AmigoLUT. Olivia Weng, Marta Andronic, Danial Zuberi, Jiaqing Chen, Caleb Geniesse, George A. Constantinides, Nhan Tran, Nicholas J. Fraser, Javier Mauricio Duarte, Ryan Kastner. 25-35
- ReducedLUT: Table Decomposition with "Don't Care" Conditions. Oliver Cassidy, Marta Andronic, Samuel Coward, George A. Constantinides. 36-42
- wa-hls4ml and lui-gnn: A Benchmark and GNN based Surrogate Model for hls4ml Resource and Latency Estimation. Benjamin Hawks, Dennis Plotnikov, Nhan Tran, Karla Tame-Narvaez, Mohammad Mehdi Rahimifar, Hamza Ezzaoui Rahali, Audrey C. Therrien, Giuseppe Di Guglielmo, Javier Duarte, Vladimir Loncar. 43
- InTRRA: Inter-Task Resource-Repurposing Accelerator for Efficient Transformer Inference on FPGAs. Zifan He, Hersh Gupta, Huifeng Ke, Jason Cong. 44
- DPUV4E: High-Throughput DPU Architecture Design for CNN on Versal ACAP. Guoyu Li, Pengbo Zheng, Jian Weng, Enshan Yang. 45
- Performance Analysis of GEMM Workloads on the AMD Versal Platform. Kaustubh Manohar Mhatre, Venkata Guru Prasanth Mulleti, Curt John Bansil, Endri Taka, Aman Arora. 46
- HiGTR: High-Performance FPGA Implementation of Complete GNN-based Trajectory Reconstruction for HEP. Yun-Chen Yang, Hsuan-Wei Yu, Bo-Cheng Lai, Shih-Chieh Hsu, Mark S. Neubauer, Santosh Parajuli. 47
- FPGA Implementation of a 1D-CNN Modulation Classifier for Radar Signals. Edgard Cansio. 48
- RRNS Arith Lib - An Open-Source Redundant Residue Number System Arithmetic VHDL Library. Tim Oberschulte, Enno Sievers, Holger Blume. 49
- Resource Scheduling for Real-Time Machine Learning. Suyash Vardhan Singh, Iftakhar Ahmad, David Andrews 0001, Miaoqing Huang, Austin R. J. Downey, Jason D. Bakos. 50
- BAQET: BRAM-aware Quantization for Efficient Transformer Inference via Stream-based Architecture on an FPGA. LingChi Yang, Chi-Jui Chen, Trung Le, Bo-Cheng Lai, Scott Hauck, Shih-Chieh Hsu. 51
- Neural Network Inference in High-Performance Computing: Closing the Gap for FINN based Reconfigurable Accelerators. Linus Jungemann, Bjarne Wintermann, Heinrich Riebler, Christian Plessl. 52
- An Empirical Comparison of LLM-based Hardware Design and High-level Synthesis. Fan Cui, Youwei Xiao, Kexing Zhou, Yun Liang 0001. 53
- FPGA-Oriented Design Space Exploration of a Real-Time Road Scene Semantic Segmentation Deep Neural Network. Hugo Le Blevec, Mathieu Léonardon, Stefan Weithoffer, Matthieu Arzel. 54
- FMC-LLM: Enabling FPGAs for Efficient Batched Decoding of 70B+ LLMs with a Memory-Centric Streaming Architecture. Wenheng Ma, Xinhao Yang, Shulin Zeng, Tengxuan Liu, Libo Shen, Hongyi Wang, Shiyao Li, Jiewen Wang, Yuhan Zhang, Hao Guo, Jintao Li, Ziming Zhang, Zhenhua Zhu, Xuefei Ning, Tsung-Yi Ho, Guohao Dai, Yu Wang 0002. 55
- Hercules: Efficient Verification of High-Level Synthesis Designs with FPGA Acceleration. Shuoxiang Xu, Zijian Jiang, Yuxin Zhang, David Boland, Yungang Bao, Kan Shi. 56-66
- An Efficient Traversal Method for FPGA Interconnect Testing Based on Regular Routing. Wenwei Chen, Lin Ye, Xiaotong Zhao, Tongshu Ding, Jian Wang, Jinmei Lai. 67-77
- Two-Phase Transistor Sizing for FPGAs via Bayesian Optimization. Xianfeng Cao, Huizhen Kuang, Yuanqi Wang, Lingli Wang. 78-84
- TAPCA: An Interface-Aware Cache Management Framework for Task Partitioning on CPU-FPGA SoC Platforms. Enlai Li, Zhe Lin 0007, Sharad Sinha, Wei Zhang 0012. 85-91
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines. Jinming Zhuang, Shaojie Xiang, Hongzheng Chen, Niansong Zhang, Zhuoping Yang, Tony Mao, Zhiru Zhang, Peipei Zhou 0001. 92-102
- Stream-HLS: Towards Automatic Dataflow Acceleration. Suhail Basalama, Jason Cong. 103-114
- FAST: FPGA Acceleration of Fully Homomorphic Encryption with Efficient Bootstrapping. Zhihan Xu, Tian Ye 0002, Rajgopal Kannan, Viktor K. Prasanna. 115-126
- OLA: An FPGA-based Overlay Accelerator for Privacy Preserving Machine Learning with Homomorphic Encryption. Yang Yang 0111, Rajgopal Kannan, Viktor K. Prasanna. 127-138
- CIVIC-FPGA: A Trusted FPGA Design Validation by Multi-Tenant Cloud Providers. Yu Feng, Zhaoqi Wang, Christophe Bobda. 139-145
- Lessons from 40 Years of Reconfigurable Computing. John Wawrzynek. 146
- FRIDA: Reconfigurable Arrays for Dynamically Scheduled High-Level Synthesis. Louis Coulon, Lucas Ramirez, Jason Helge Anderson, Mirjana Stojilovic, Paolo Ienne. 147-158
- Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration. Endri Taka, Ning-Chi Huang, Chi-Chih Chang, Kai-Chiang Wu, Aman Arora, Diana Marculescu. 159-171
- Tile-Level Pipeline for Linear Scalable Stencil Computation on AMD AI Engines. Zhenyu Xu, Miaoxiang Yu, Yazhe Zhang, Jillian Cai, Qing Yang, Tao Wei. 172-178
- Enhancing FPGAs with Analog In-Memory Computing Macros. Archit Gajjar, Lei Zhao, Omar Eldash, Aishwarya Natarajan, Rand Jean, Xia Sheng, Giacomo Pedretti, Paolo Faraboschi, Jim Ignowski, Luca Buonanno. 179
- Measuring the Minimum Power Requirement of FPGA Architectural Specifications. Andy Gean Ye, Anas Razzaq. 180
- Towards Accelerator Customization in Real-time Safety-critical Systems. Shixin Ji, Xingzhen Chen, Wei Zhang, Zhuoping Yang, Jinming Zhuang, Sarah Schultz, Yukai Song, Jingtong Hu, Alex K. Jones, Zheng Dong, Peipei Zhou 0001. 181
- PipeLink: A Pipelined Resource Sharing System for Dataflow High-Level Synthesis. Rui Li, Rajit Manohar. 182
- High Throughput Low Latency Network Intrusion Detection on FPGAs: A Raw Packet Approach. Muhammad Ali Farooq, Abid Rafique, Suhaib A Fahmy, Aman Arora. 183
- HEDWIG: Homomorphic Encryption Accelerator Design Using BFV-HPS With HiGh-Speed Fixed-Point Approximation. Antian Wang, Weihang Tan, Zhenyu Xu, Tao Wei, Caiwen Ding, Keshab K. Parhi, Yingjie Lao. 184
- Seamless Acceleration of Fortran Intrinsics via AMD AI Engines. Nick Brown 0002, Gabriel Rodriguez-Canal. 185
- No Time to Lose: Enabling Real-Time Fluorescence Lifetime Imaging on Resource-constrained FPGAs Through Efficient Scheduling. Ismail Erbas, Aporva Amarnath, Vikas Pandey, Karthik Swaminathan, Naigang Wang, Xavier Intes. 186
- A Unified Framework for Automated Code Transformation and Pragma Insertion. Stéphane Pouget, Louis-Noël Pouchet, Jason Cong. 187-198
- Latency Insensitivity Testing for Dataflow HLS Designs. Jianyi Cheng, Lianghui Wang, Zijian Jiang, Yungang Bao, Kan Shi. 199-210
- Dynamic Loop Fusion in High-Level Synthesis. Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede. 211-222
- HUMA: Heterogeneous, Ultra Low-Latency Model Accelerator for The Virtual Brain on a Versal Adaptive SoC. Amirreza Movahedin, Lennart P. L. Landsmeer, Christos Strydis. 223-233
- SAT-Accel: A Modern SAT Solver on a FPGA. Michael Lo, Mau-Chung Frank Chang, Jason Cong. 234-246
- FPGA-Only Implementation of MIPI C-PHY Receiver Using Blind Oversampling CDR for CMOS Image Sensors. Jun Yeon Won, Shinki Jeong, Seongkwan Lee, Minho Kang, Insu Yang, Jaemoo Choi. 247-256