- AutoNTT: Automatic Architecture Design and Exploration for Number Theoretic Transform Acceleration on FPGAs. Dilshan Kumarathunga, Qilin Hu, Zhenman Fang. 1-9
- FPGA-Based Approximate Multiplier for FP8. Ruiqi Chen 0001, Yangxintong Lyu, Han Bao, Jiayu Liu, Yanxiang Zhu, Shidi Tang, Ming Ling, Bruno da Silva 0001. 1-9
- Reconfigurable Retina-Inspired Looming Detection. Jason Sinaga, Shay Snyder, Md. Abdullah-Al Kaiser, Dan Jinoy, Gregory Schwartz, Maryam Parsa, Akhilesh Jaiswal 0001. 1-7
- Defending Side-Channel Attacks in Convolutional Neural Networks with Channel-Level Parallelization. Yankun Zhu, Ranxi Lin, Pingqiang Zhou. 1
- RealProbe: An Automated and Lightweight Performance Profiler for In-FPGA Execution of High-Level Synthesis Designs. Jiho Kim, Cong Hao. 10-18
- HP-FFT: A General High-Performance FFT Generator Using High-Level Synthesis. Chengyue Wang 0002, Jiahao Zhang, Yingquan Wu, Jason Cong. 19-23
- FREEDOM: FPGA-Based Hardware Redaction Emulator. Chaital G. Sathe, Yiorgos Makris, Benjamin Carrion Schafer. 24-28
- HBMex: An Attachment for Nonbursting Accelerators to Enhance HBM Performance. Canberk Sönmez, Mohamed Shahawy, Paolo Ienne. 29-37
- High Throughput Matrix Transposition on HBM-Enabled FPGAs. Yang Yang 0111, Rajgopal Kannan, Viktor K. Prasanna. 38-46
- Banked Memories for Soft SIMT Processors. Martin Langhammer, George A. Constantinides. 47-55
- Soaring with TRILLI: An HW/SW Heterogeneous Accelerator for Multi-Modal Image Registration. Giuseppe Sorrentino, Paolo Salvatore Galfano, Eleonora D'Arnese, Davide Conficconi. 56-65
- HighWave: Large-Scale High-Bandwidth Wave Simulations on FPGAs. Dimitrios Gourounas, Austin G. James, Bagus Hanindhito, Arash Fathi, Lizy K. John, Andreas Gerstlauer. 66-74
- SMART: High-Performance SAR ATR Through Model-Architecture Co-Design on FPGA. Sachini Wickramasinghe, Yi-Chien Lin, Cauligi S. Raghavendra, Viktor K. Prasanna. 75-84
- Efficiency, Expressivity, and Extensibility in a Close-to-Metal NPU Programming Interface. Erika Hunhoff, Joseph Melber, Kristof Denolf, Andra Bisca, Samuel Bayliss, Stephen Neuendorffer, Jeff Fifield, Jack Lo, Pranathi Vasireddy, Phil James-Roxby, Eric Keller. 85-94
- Efficient and Distributed Computation of Electron Repulsion Integrals on AMD AI Engines. Johannes Menzel, Christian Plessl. 95-104
- Chronbench: An Incremental HDL Benchmark Suite. Zakary Nafziger, Steven J. E. Wilton. 105-113
- ITERA-LLM: Boosting Sub-8-Bit Large Language Model Inference via Iterative Tensor Decomposition. Yinting Huang, Keran Zheng, Zhewen Yu, Christos-Savvas Bouganis. 114-122
- InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs. Zifan He, Anderson Truong, Yingqi Cao, Jason Cong. 123-132
- LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation. Zixi Zhang, Balint Szekely, Pedro Gimenes, Greg Chadwick, Hugo McNally, Jianyi Cheng, Robert D. Mullins, Yiren Zhao. 133-137
- SoftCUDA: Running CUDA on Softcore GPU. Chihyo Ahn, Ruobing Han, Udit Subramanya, Jisheng Zhao, Blaise Tine, Hyesoon Kim. 138-142
- Guaranteed Yet Hard to Find: Uncovering FPGA Routing Convergence Paradox. Shashwat Shrivastava, Stefan Nikolic 0001, Sun Tanaka, Chirag Ravishankar, Dinesh Gaitonde, Mirjana Stojilovic. 143-151
- N-TORC: Native Tensor Optimizer for Real-Time Constraints. Suyash Vardhan Singh, Iftakhar Ahmad, David Andrews 0001, Miaoqing Huang, Austin R. J. Downey, Jason D. Bakos. 152-161
- NoH: NoC Compilation in High-Level Synthesis. Huifeng Ke, Sihao Liu, Licheng Guo, Zifan He, Linghao Song, Suhail Basalama, Yuze Chi, Tony Nowatzki, Jason Cong. 162-171
- A Partitioning-Based CAD Flow for Interposer-Based Multi-Die FPGAs. Mahesh A. Iyer, Andrew B. Kahng, Jason Luu, Bodhisatta Pramanik, Kristofer Vorwerk, Grace Zgheib. 172-180
- Transfer Learning on the Edge for a Wireless Application Using an SoC Platform. Yiyue Jiang, John Dooley, Aidan Edward Colgan, Jonathan Guimaraes Ribeiro, Zhilin Ren, Miriam Leeser. 181-188
- Moyogi: A Memory-Centric Accelerator for Low-Latency Random Forest Inference on Embedded Devices. Alessandro Verosimile, Francesco Peverelli, Marco D. Santambrogio. 189-197
- IceSpy: Reconfigurable Edge Accelerator for Scalable and Private Structural Health Monitoring. Alexandra Zhang Jiang, Jonathan Ta, Yuqiao Li, Zhou Li, Nalini Venkatasubramanian, Monica D. Kohler, Sang-Woo Jun. 198-207
- NeuraLUT-Assemble: Hardware-Aware Assembling of Sub-Neural Networks for Efficient LUT Inference. Marta Andronic, George A. Constantinides. 208-216
- An Efficient FPGA-Based Hardware Accelerator of Fully Quantized Mamba-2. Kailing Zhou, Han Jiao 0003, Wenjin Huang, Yihua Huang 0005. 217-226
- A FeFET-Based Compute-in-Memory Architecture on FPGA for Neural Network Inference. Minghan Jiang, Yonggen Li, Rui Xiao, Haibin Shen, Kejie Huang. 236-242
- Compute-In-Memory on FPGAs for Deep Learning: A Review. Aman Arora 0001. 243-253
- Toward Reconfigurable In-Pixel Computing: A Fault-Tolerant Design Flow for Machine Learning Accelerators (Invited Paper). Houxuan Guo, Manuel Blanco Valentin, Xiuyuan He, Seda Ogrenci. 261-267
- TrackGNN: A Highly Parallelized and Self-Adaptive GNN Accelerator for Track Reconstruction on FPGAs. Shuyang Li, Hanqing Zhang, Ruiqi Chen 0001, Bruno da Silva 0001, Giorgian Borca-Tasciuc, Dantong Yu, Cong Hao. 269
- SparseLUT: Sparse Connectivity Optimization for Lookup Table-Based Deep Neural Networks. Binglei Lou, Ruilin Wu, Philip Leong. 270
- Unlocking the AMD Neural Processing Unit for ML Training on the Client Using Bare-Metal-Programming Tools. André Rösti, Michael Franz. 271
- An Energy-Efficient FPGA-Based Vision Transformer Accelerator via Software-Hardware Co-Design. Jiacheng Cao, Jiaqi Guo, Wei Xiong, Huanlin Luo, Jian Wang, Jinmei Lai. 272
- C2OPU: Hybrid Compute-in-Memory and Coarse-Grained Reconfigurable Architecture for Overlay Processing of Transformers. Siyuan Miao, Lingkang Zhu, Chen Wu, Shaoqiang Lu, Jinming Lyu, Lei He. 273
- UltraFormer: An Efficient Transformer for FPGAs. Victor Agostinelli, Nicolas Bohm Agostini, Antonino Tumeo. 274
- Microscaling Vision Transformers on FPGAs. Can Xiao, Jianyi Cheng, Aaron Zhao. 275
- BiKA: Binarized KAN-inspired Neural Network for Efficient Hardware Accelerator Designs. Yuhao Liu, Salim Ullah, Akash Kumar 0001. 276
- Multi-FPGA Synchronization and Data Communication for Quantum Control and Measurement. Yilun Xu, Abhi D. Rajagopala, Neelay Fruitwala, Gang Huang. 277
- RapidPnR: Accelerating the Physical Design for FPGAs via Design-Level Parallelism. Wanzheng Weng, Pingqiang Zhou. 278
- A High-Throughput Implementation of the MUSIC Algorithm Using AMD Versal AI Engine. Peifang Zhou, Bachir Berkane, Vlad Druz, Mark Rollins. 279
- APR-OIS: A Near-Sensor Point Cloud Pre-Processing Accelerator on FPGA. Yiming Gao, Herman Lam. 280
- Optimized Coding and Parameter Selection for Efficient FPGA Design of Attention Mechanisms. Ehsan Kabir, Austin R. J. Downey, Jason D. Bakos, David Andrews 0001, Miaoqing Huang. 281
- On Improving the HLS Compatibility of Large C/C++ Code Regions. Tiago Santos, João Bispo, João M. P. Cardoso, James C. Hoe. 282
- Accelerating Scientific Model Optimization with a Pipelined FPGA-Based Differential Evolution Engine. Manuel de Castro, Roberto R. Osorio, Yuri Torres, Diego R. Llanos. 283
- ASTEF: FPGA-Based Enhancement of Event Camera Performance in Low-Light Conditions. Zhaoqi Wang, Peter Mbua, Christophe Bobda. 284
- RV-ESMC: Efficient Sparse Matrix Convolution Processor based on RISC-V Custom instructions for Edge Platforms. Huachen Zhang, Jianyang Ding, Bowen Jiang, Tianshuo Lu, Wei Xu, ZhiLei Chai. 285
- EVO-QNN: Efficient Mixed-Precision Quantization Inference on RISC-V-Based Edge Device. Tianshuo Lu, Jianyang Ding, Huachen Zhang, Bowen Jiang, Wei Xu, ZhiLei Chai. 286
- DRSA: Accelerating Macro Placement on Commercial FPGAs. Menzo Bouaissi, Paolo Ienne, Lana Josipovic, Andrea Guerrieri. 287
- Low-Latency FFT/iFFT RTL Implementation for the FALCON Post-Quantum Signature Algorithm. Alexandre Ortega, Lilian Bossuet, Brice Colombier. 288
- Breaking New Ground: Division Directly in Memory. Farzad Razi, Mehran Shoushtari Moghadam, M. Hassan Najafi, Sercan Aygun, Marc D. Riedel. 289
- Analog In-Memory Computing Enhanced FPGA for High-Throughput and Energy-Efficient Acceleration. Archit Gajjar, Lei Zhao, Omar Eldash, Aishwarya Natarajan, Xia Sheng, Giacomo Pedretti, Aman Arora 0001, Paolo Faraboschi, Jim Ignowski, Luca Buonanno. 290
- BCIM: A Bit-Serial Approach for Block-Cipher-in-Memory. Andrew Dervay, Omar Al Kailani, Wenfeng Zhao. 291
- LLM-IMC: Automating Analog In-Memory Computing Architecture Generation with Large Language Models. Deepak Vungarala, Md Hasibul Amin, Pietro Mercati, Arman Roohi, Ramtin Zand, Shaahin Angizi. 292
- A Multimodal AI Acceleration with Dynamic Pruning and Run-Time Configuration. Hyun Woo Oh, Hanning Chen, Sanggeon Yun, Yang Ni 0001, Behnam Khaleghi, Fei Wen, Mohsen Imani. 293
- Performance Modeling and Comparisons of an FPGA-based Direct Simulation Monte Carlo Solver. Saleen Bhattarai, Edwin Peters, Sean O'Byrne, David Petty. 294
- SmartNIC-Based Distributed Shared Memory. Hemanth Ramesh, Naarayanan Rao VSathish, Edson Horta, Antonio Barbalace, Binoy Ravindran. 295
- iSEW: in-Sensor Embedded Watermarking for Secure Imaging. Sepehr Tabrizchi, Shaahin Angizi, Arman Roohi. 296
- Efficient Adaptable Streaming Aggregation Engine. Philippos Papaphilippou, Wayne Luk, David Gregg. 297
- Ph.D. Project ARIES: Efficient Mapping and Automated Compilation for AMD Versal Devices. Jinming Zhuang, Peipei Zhou 0001. 298-299
- Ph.D. Project: An Efficient NTT Accelerator Supporting Various Lengths for HHE. Hang Gu, Teng Wang, Chao Wang. 300-301
- Ph.D. Project: Floorplan Quality Prediction for HLS Design Exploration on Multi-Die FPGA. Haoran Xue, Teng Wang, Chao Wang. 302-303
- Ph.D. Project: A Novel Compilation-Based Approach for Generating Sparse Tensor Accelerators. Xingyan Chen, Lei Gong, Chao Wang. 304-305
- Ph.D. Project Hardware-Aware Neural Networks. Marta Andronic, George A. Constantinides. 306-307
- Ph.D. Project: Holistic Partitioning and Optimization of CPU-FPGA Applications Through Source-to-Source Compilation. Tiago Santos, João Bispo, João M. P. Cardoso. 308-309
- Ph.D. Project AIM: Accelerating Arbitrary-Precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP. Zhuoping Yang, Peipei Zhou 0001. 310-311