- DRAM Translation Layer: Software-Transparent DRAM Power Savings for Disaggregated Memory. Wenjing Jin 0001, Wonsuk Jang, Haneul Park, Jongsung Lee 0001, Soosung Kim 0001, Jae W. Lee.
- LeCA: In-Sensor Learned Compressive Acquisition for Efficient Machine Vision on the Edge. Tianrui Ma, Adith Jagadish Boloor, Xiangxing Yang, Weidong Cao, Patrick Williams, Nan Sun, Ayan Chakrabarti, Xuan Zhang.
- Astrea: Accurate Quantum Error-Decoding via Practical Minimum-Weight Perfect-Matching. Suhas Vittal, Poulami Das 0005, Moinuddin K. Qureshi.
- EdgePC: Efficient Deep Learning Analytics for Point Clouds on Edge Devices. Ziyu Ying 0001, Sandeepa Bhuyan, Yan Kang, Yingtian Zhang, Mahmut T. Kandemir, Chita R. Das.
- Scaling Qubit Readout with Hardware Efficient Machine Learning Architectures. Satvik Maurya, Chaithanya Naik Mude, William D. Oliver, Benjamin Lienhard, Swamit S. Tannu.
- MTIA: First Generation Silicon Targeting Meta's Recommendation Systems. Amin Firoozshahian, Joel Coburn, Roman Levenstein, Rakesh Nattoji, Ashwin Kamath, Olívia Wu, Gurdeepak Grewal, Harish Aepala, Bhasker Jakka, Bob Dreyer, Adam Hutchin, Utku Diril, Krishnakumar Nair, Ehsan K. Ardestani, Martin Schatz, Yuchen Hao, Rakesh Komuravelli, Kunming Ho, Sameer Abu Asal, Joe Shajrawi, Kevin Quinn, Nagesh Sreedhara, Pankaj Kansal, Willie Wei, Dheepak Jayaraman, Linda Cheng, Pritam Chopda, Eric Wang, Ajay Bikumandla, Arun Karthik Sengottuvel, Krishna Thottempudi, Ashwin Narasimha, Brian Dodds, Cao Gao, Jiyuan Zhang, Mohammed Al-Sanabani, Ana Zehtabioskuie, Jordan Fix, Hangchen Yu, Richard Li, Kaustubh Gondkar, Jack Montgomery, Mike Tsai, Saritha Dwarakapuram, Sanjay Desai, Nili Avidan, Poorvaja Ramani, Karthik Narayanan, Ajit Mathews, Sethu Gopal, Maxim Naumov, Vijay Rao, Krishna Noru, Harikrishna Reddy, Prahlad Venkatapuram, Alexis Bjorlin.
- ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design. Srivatsan Krishnan, Amir Yazdanbakhsh, Shvetank Prakash, Jason Jabbour, Ikechukwu Uchendu, Susobhan Ghosh, Behzad Boroujerdian, Daniel Richins, Devashree Tripathy, Aleksandra Faust, Vijay Janapa Reddi.
- SAC: Sharing-Aware Caching in Multi-Chip GPUs. Shiqing Zhang, Mahmood Naderan-Tahan, Magnus Jahre, Lieven Eeckhout.
- Enabling High Performance Debugging for Variational Quantum Algorithms using Compressed Sensing. Tianyi Hao 0003, Kun Liu, Swamit Tannu.
- ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification. Siqi Li, Fengbin Tu, Liu Liu 0017, Jilan Lin, Zheng Wang, Yangwook Kang, Yufei Ding, Yuan Xie 0001.
- LAORAM: A Look Ahead ORAM Architecture for Training Large Embedding Tables. Rachit Rajat, Yongqin Wang, Murali Annavaram.
- EMISSARY: Enhanced Miss Awareness Replacement Policy for L2 Instruction Caching. Nayana Prasad Nagendra, Bhargav Reddy Godala, Ishita Chaturvedi, Atmn Patel, Svilen Kanev, Tipp Moseley, Jared Stark, Gilles A. Pokam, Simone Campanoni, David I. August.
- On Endurance of Processing in (Nonvolatile) Memory. Salonik Resch, M. Hüsrev Cilasun, Zamshed I. Chowdhury, Masoud Zabihi, Zhengyang Zhao, Jian-Ping Wang 0006, Sachin S. Sapatnekar, Ulya R. Karpuzcu.
- Metior: A Comprehensive Model to Evaluate Obfuscating Side-Channel Defense Schemes. Peter W. Deutsch, Weon Taek Na, Thomas Bourgeat, Joel S. Emer, Mengjia Yan 0001.
- V10: Hardware-Assisted NPU Multi-tenancy for Improved Resource Utilization and Fairness. Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang 0006.
- SPADE: A Flexible and Scalable Accelerator for SpMM and SDDMM. Gerasimos Gerogiannis, Serif Yesil, Damitha Lenadora, Dingyuan Cao, Charith Mendis, Josep Torrellas.
- OneQ: A Compilation Framework for Photonic One-Way Quantum Computation. Hezi Zhang, Anbang Wu, Yuke Wang, Gushu Li, Hassan Shapourian, Alireza Shabani, Yufei Ding.
- TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings. Norman P. Jouppi, George Kurian, Sheng Li 0007, Peter C. Ma, Rahul Nagarajan, Lifeng Nai, Nishant Patil, Suvinay Subramanian, Andy Swing, Brian Towles, Clif Young, Xiang Zhou, Zongwei Zhou, David A Patterson.
- All Your PC Are Belong to Us: Exploiting Non-control-Transfer Instruction BTB Updates for Dynamic PC Extraction. Jiyong Yu, Trent Jaeger, Christopher Wardlaw Fletcher.
- RoboShape: Using Topology Patterns to Scalably and Flexibly Deploy Accelerators Across Robots. Sabrina M. Neuman, Radhika Ghosal, Thomas Bourgeat, Brian Plancher, Vijay Janapa Reddi.
- R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs. Dongho Ha, Yunho Oh, Won Woo Ro.
- RoSÉ: A Hardware-Software Co-Simulation Infrastructure Enabling Pre-Silicon Full-Stack Robotics SoC Evaluation. Dima Nikiforov, Shengjun Chris Dong, Chengyi Lux Zhang, Seah Kim, Borivoje Nikolic, Yakun Sophia Shao.
- Clifford-based Circuit Cutting for Quantum Simulation. Kaitlin N. Smith, Michael A. Perlin, Pranav Gokhale, Paige Frederick, David Owusu-Antwi, Richard Rines, Victory Omole, Frederic T. Chong.
- Supply Chain Aware Computer Architecture. August Ning, Georgios Tziantzioulis, David Wentzlaff.
- F4T: A Fast and Flexible FPGA-based Full-stack TCP Acceleration Framework. Junehyuk Boo, Yujin Chung, Eunjin Baek, Seongmin Na, Changsu Kim, Jangwoo Kim.
- MXFaaS: Resource Sharing in Serverless Environments for Parallelism and Efficiency. Jovan Stojkovic, Tianyin Xu, Hubertus Franke, Josep Torrellas.
- Architecting Efficient Multi-modal AIoT Systems. Xiaofeng Hou, Jiacheng Liu, Xuehan Tang, Chao Li, Jia Chen, Luhong Liang, Kwang-Ting Cheng, Minyi Guo.
- TaskFusion: An Efficient Transfer Learning Architecture with Dual Delta Sparsity for Multi-Task Natural Language Processing. Zichen Fan, Qirui Zhang 0001, Pierre Abillama, Sara Shoouri, Changwoo Lee, David T. Blaauw, Hun-Seok Kim, Dennis Sylvester.
- OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization. Cong Guo 0003, Jiaming Tang, Weiming Hu, Jingwen Leng, Chen Zhang 0001, Fan Yang 0024, Yunxin Liu, Minyi Guo, Yuhao Zhu 0001.
- Hardware Acceleration of Neural Graphics. Muhammad Husnain Mubarik, Ramakrishna Kanungo, Tobias Zirr, Rakesh Kumar 0002.
- Q-BEEP: Quantum Bayesian Error Mitigation Employing Poisson Modeling over the Hamming Spectrum. Samuel Alexander Stein, Nathan Wiebe, Yufei Ding, James Ang, Ang Li.
- Programmable Olfactory Computing. Nathaniel Bleier, Abigail Wezelis, Lav Varshney, Rakesh Kumar 0002.
- Imprecise Store Exceptions. Siddharth Gupta 0003, Yuanlong Li, Qingxuan Kang, Abhishek Bhattacharjee, Babak Falsafi, Yunho Oh, Mathias Payer.
- RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required! Tanner Andrulis, Joel S. Emer, Vivienne Sze.
- Optimizing CPU Performance for Recommendation Systems At-Scale. Rishabh Jain, Scott Cheng, Vishwas Kalagi, Vrushabh Sanghavi, Samvit Kaul, Meena Arunachalam, Kiwan Maeng, Adwait Jog, Anand Sivasubramaniam, Mahmut Taylan Kandemir, Chita R. Das.
- Understanding and Mitigating Hardware Failures in Deep Learning Training Systems. Yi He 0010, Mike Hutton, Steven Chan, Robert De Gruijl, Rama Govindaraju, Nishant Patil, Yanjing Li.
- Dancing the Quantum Waltz: Compiling Three-Qubit Gates on Four Level Architectures. Andrew Litteken, Lennart Maximilian Seifert, Jason D. Chadwick, Natalia Nottingham, Tanay Roy, Ziqian Li, David I. Schuster, Frederic T. Chong, Jonathan M. Baker.
- TEA: Time-Proportional Event Analysis. Björn Gottschall, Lieven Eeckhout, Magnus Jahre.
- Implicit Memory Tagging: No-Overhead Memory Safety Using Alias-Free Tagged ECC. Michael B. Sullivan 0001, Mohamed Tarek Ibn Ziad, Aamer Jaleel, Stephen W. Keckler.
- With Shared Microexponents, A Little Shifting Goes a Long Way. Bita Darvish Rouhani, Ritchie Zhao, Venmugil Elango, Rasoul Shafipour, Mathew Hall, Maral Mesmakhosroshahi, Ankit More, Levi Melnick, Maximilian Golub, Girish Varatkar, Lai Shao, Gaurav Kolhe, Dimitry Melts, Jasmine Klar, Renee L'Heureux, Matt Perry, Doug Burger, Eric S. Chung, Zhaoxia (Summer) Deng, Sam Naghshineh, JongSoo Park, Maxim Naumov.
- Parallel Driving for Fast Quantum Computing Under Speed Limits. Evan McKinney, Chao Zhou, Mingkang Xia, Michael Hatridge, Alex K. Jones.
- MESA: Microarchitecture Extensions for Spatial Architecture Generation. Dong Kai Wang, Jiaqi Lou, Naiyin Jin, Edwin Mascarenhas, Rohan Mahapatra, Sean Kinzer, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Nam Sung Kim.
- CAMJ: Enabling System-Level Energy Modeling and Architectural Exploration for In-Sensor Visual Computing. Tianrui Ma, Yu Feng 0007, Xuan Zhang, Yuhao Zhu 0001.
- μManycore: A Cloud-Native CPU for Tail at Scale. Jovan Stojkovic, Chunao Liu, Muhammad Shahbaz, Josep Torrellas.
- Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction. Sixu Li, Chaojian Li, Wenbo Zhu, Boyang Tony Yu, Yang Katie Zhao, Cheng Wan, Haoran You, Huihong Shi, Yingyan Celine Lin.
- Write-Light Cache for Energy Harvesting Systems. Jongouk Choi, Jianping Zeng 0001, Dongyoon Lee, Changwoo Min, Changhee Jung.
- Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses. Rakesh Nadig, Mohammad Sadrosadati, Haiyu Mao, Nika Mansouri-Ghiasi, Arash Tavakkol, Jisung Park 0001, Hamid Sarbazi-Azad, Juan Gómez-Luna, Onur Mutlu.
- SCALO: An Accelerator-Rich Distributed System for Scalable Brain-Computer Interfacing. Karthik Sriram, Raghavendra Pradyumna Pothukuchi, Michal Gerasimiuk, Muhammed Ugur, Oliver Ye, Rajit Manohar, Anurag Khandelwal, Abhishek Bhattacharjee.
- Pensieve: Microarchitectural Modeling for Security Evaluation. Yuheng Yang, Thomas Bourgeat, Stella Lau, Mengjia Yan 0001.
- QIsim: Architecting 10+K Qubit QC Interfaces Toward Quantum Supremacy. Dongmoon Min, Junpyo Kim, Junhyuk Choi, Ilkwon Byun, Masamitsu Tanaka, Koji Inoue, Jangwoo Kim.
- RowPress: Amplifying Read Disturbance in Modern DRAM Chips. Haocong Luo, Ataberk Olgun, Abdullah Giray Yaglikçi, Yahya Can Tugrul, Steve Rhyner, Meryem Banu Cavlak, Joël Lindegger, Mohammad Sadrosadati, Onur Mutlu.
- FDMAX: An Elastic Accelerator Architecture for Solving Partial Differential Equations. Jiajun Li, Yuxuan Zhang, Hao Zheng 0005, Ke Wang 0030.
- Energy-Efficient Realtime Motion Planning. Deval Shah, Ningfeng Yang, Tor M. Aamodt.
- An Algorithm and Architecture Co-design for Accelerating Smart Contracts in Blockchain. Rui Pan, Chubo Liu, Guoqing Xiao, Mingxing Duan, Keqin Li 0001, Kenli Li 0001.
- ImaGen: A General Framework for Generating Memory- and Power-Efficient Image Processing Accelerators. Nisarg Ujjainkar, Jingwen Leng, Yuhao Zhu 0001.
- A Research Retrospective on AMD's Exascale Computing Journey. Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Vignesh Adhinarayanan, Shaizeen Aga, Derrick Aguren, Varun Agrawal, Ashwin M. Aji, Johnathan Alsop, Paul T. Bauman, Bradford M. Beckmann, Majed Valad Beigi, Sergey Blagodurov, Travis Boraten, Michael Boyer, William C. Brantley, Noel Chalmers, Shaoming Chen, Kevin Cheng, Michael L. Chu, David Cownie, Nicholas Curtis, Joris Del Pino, Nam Duong, Alexandru Dutu, Yasuko Eckert, Christopher Erb, Chip Freitag, Joseph L. Greathouse, Sudhanva Gurumurthi, Anthony Gutierrez, Khaled Hamidouche, Sachin Hossamani, Wei Huang 0004, Mahzabeen Islam, Nuwan Jayasena, John Kalamatianos, Onur Kayiran, Jagadish Kotra, Alan Lee, Daniel Lowell, Niti Madan, Abhinandan Majumdar, Nicholas Malaya, Srilatha Manne, Susumu Mashimo, Damon McDougall, Elliot Mednick, Michael Mishkin, Mark Nutter, Indrani Paul, Matthew Poremba, Brandon Potter, Kishore Punniyamurthy, Sooraj Puthoor, Steven E. Raasch, Karthik Rao, Gregory Rodgers, Marko Scrbak, Mohammad Seyedzadeh, John Slice, Vilas Sridharan, René van Oostrum, Eric Van Tassell, Abhinav Vishnu, Samuel Wasmundt, Mark Wilkening, Noah Wolfe, Mark Wyse, Adithya Yalavarti, Dmitri Yudanov.
- SHARP: A Short-Word Hierarchical Accelerator for Robust and Practical Fully Homomorphic Encryption. Jongmin Kim 0007, Sangpyo Kim, Jaewan Choi, Jaiyoung Park, Donghwan Kim, Jung Ho Ahn.
- DynAMO: Improving Parallelism Through Dynamic Placement of Atomic Memory Operations. Víctor Soria Pardos, Adrià Armejach, Tiago Mück, Darío Suárez Gracia, José A. Joao, Alejandro Rico, Miquel Moretó.
- RSQP: Problem-specific Architectural Customization for Accelerated Convex Quadratic Optimization. Maolin Wang, Ian McInerney, Bartolomeo Stellato, Stephen P. Boyd, Hayden Kwok-Hay So.
- CDPU: Co-designing Compression and Decompression Processing Units for Hyperscale Systems. Sagar Karandikar, Aniruddha N. Udipi, Junsun Choi, Joonho Whangbo, Jerry Zhao, Svilen Kanev, Edwin Lim, Jyrki Alakuijala, Vrishab Madduri, Yakun Sophia Shao, Borivoje Nikolic, Krste Asanovic, Parthasarathy Ranganathan.
- Profiling Hyperscale Big Data Processing. Abraham Gonzalez, Aasheesh Kolli, Samira Manabi Khan, Sihang Liu 0001, Vidushi Dadu, Sagar Karandikar, Jichuan Chang, Krste Asanovic, Parthasarathy Ranganathan.
- Flumen: Dynamic Processing in the Photonic Interconnect. Kyle Shiflett, Avinash Karanth, Razvan C. Bunescu, Ahmed Louri.
- Accelerating Personalized Recommendation with Cross-level Near-Memory Processing. Haifeng Liu, Long Zheng 0003, Yu Huang 0013, Chaoqiang Liu, Xiangyu Ye, Jingrui Yuan, Xiaofei Liao, Hai Jin 0001, Jingling Xue.
- Doppelganger Loads: A Safe, Complexity-Effective Optimization for Secure Speculation Schemes. Amund Bergland Kvalsvik, Pavlos Aimoniotis, Stefanos Kaxiras, Magnus Själander.
- Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters. Kaiyang Zhao, Kaiwen Xue, Ziqi Wang, Dan Schatzberg, Leon Yang, Antonis Manousis, Johannes Weiner, Rik van Riel, Bikash Sharma, Chunqiang Tang, Dimitrios Skarlatos 0002.
- TEESec: Pre-Silicon Vulnerability Discovery for Trusted Execution Environments. Moein Ghaniyoun, Kristin Barber, Yuan Xiao 0001, Yinqian Zhang, Radu Teodorescu.
- SmartDS: Middle-Tier-centric SmartNIC Enabling Application-aware Message Split for Disaggregated Block Storage. Jie Zhang, Hongjing Huang, Lingjun Zhu, Shu Ma, Dazhong Rong, Yijun Hou, Mo Sun, Chaojie Gu, Peng Cheng, Chao Shi, Zeke Wang.
- GenDP: A Framework of Dynamic Programming Acceleration for Genome Sequencing Analysis. Yufeng Gu, Arun Subramaniyan 0001, Timothy Dunn, Alireza Khadem, Kuan-Yu Chen, Somnath Paul, Md. Vasimuddin, Sanchit Misra, David T. Blaauw, Satish Narayanasamy, Reetuparna Das.
- Orinoco: Ordered Issue and Unordered Commit with Non-Collapsible Queues. Dibei Chen, Tairan Zhang, Yi Huang, Jianfeng Zhu 0001, Yang Liu, Pengfei Gou, Chunyang Feng, Binghua Li, Shaojun Wei, Leibo Liu.
- HAAC: A Hardware-Software Co-Design to Accelerate Garbled Circuits. Jianqiao Mo, Jayanth Gopinath, Brandon Reagen.
- NeuRex: A Case for Neural Rendering Acceleration. Junseo Lee, Kwanseok Choi, Jungi Lee, Seokwon Lee, Joonho Whangbo, Jaewoong Sim.
- Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators. Jingwei Cai, Yuchen Wei, Zuotong Wu, Sen Peng, Kaisheng Ma.
- Nimblock: Scheduling for Fine-grained FPGA Sharing through Virtualization. Meghna Mandava, Paul Reckamp, Deming Chen.
- K-D Bonsai: ISA-Extensions to Compress K-D Trees for Autonomous Driving Tasks. Pedro Henrique Exenberger Becker, José María Arnau, Antonio González 0001.
- ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks. Yu Gong, Miao Yin, Lingyi Huang, Jinqi Xiao, Yang Sui, Chunhua Deng, Bo Yuan 0001.
- MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing. Dan Chen, Haiheng He, Hai Jin 0001, Long Zheng 0003, Yu Huang 0013, Xinyang Shen, Xiaofei Liao.
- Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design. Yonggan Fu, Zhifan Ye, Jiayi Yuan, Shunyao Zhang, Sixu Li, Haoran You, Yingyan Lin.
- Shogun: A Task Scheduling Framework for Graph Mining Accelerators. Yibo Wu, Jianfeng Zhu 0001, Wenrui Wei, Longlong Chen, Liang Wang, Shaojun Wei, Leibo Liu.
- FACT: FFN-Attention Co-optimized Transformer Architecture with Eager Correlation Prediction. Yubin Qin, Yang Wang, Dazheng Deng, Zhiren Zhao, Xiaolong Yang, Leibo Liu, Shaojun Wei, Yang Hu 0001, Shouyi Yin.
- Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks. Mingyu Liang, Wenyin Fu, Louis Feng, Zhongyi Lin, Pavani Panakanti, Shengbao Zheng, Srinivas Sridharan, Christina Delimitrou.
- ISA-Grid: Architecture of Fine-grained Privilege Control for Instructions and Registers. Shulin Fan, Zhichao Hua 0001, Yubin Xia, Haibo Chen 0001, Binyu Zang.
- Spy in the GPU-box: Covert and Side Channel Attacks on Multi-GPU Systems. Sankha Baran Dutta, Hoda Naghibijouybari, Arjun Gupta, Nael B. Abu-Ghazaleh, Andres Marquez, Kevin J. Barker.
- MapZero: Mapping for Coarse-grained Reconfigurable Architectures with Reinforcement Learning and Monte-Carlo Tree Search. Xiangyu Kong, Yi Huang, Jianfeng Zhu, Xingchen Man, Yang Liu, Chunyang Feng, Pengfei Gou, Minggui Tang, Shaojun Wei, Leibo Liu.
- Decoupled SSD: Rethinking SSD Architecture through Network-based Flash Controllers. Jiho Kim, Myoungsoo Jung, John Kim.