- NvMR: non-volatile memory renaming for intermittent computing. Abhishek Bhattacharyya, Abhijith Somashekhar, Joshua San Miguel. 1-13
- Free atomics: hardware atomic operations without fences. Ashkan Asgharzadeh, Juan M. Cebrian, Arthur Perais, Stefanos Kaxiras, Alberto Ros. 14-26
- Securing GPU via region-based bounds checking. Jaewon Lee, Yonghae Kim, Jiashen Cao, Euna Kim, Jaekyu Lee, Hyesoon Kim. 27-41
- täkō: a polymorphic cache hierarchy for general-purpose optimization of data movement. Brian C. Schwedock, Piratach Yoovidhya, Jennifer Seibert, Nathan Beckmann. 42-58
- EQC: ensembled quantum computing for variational quantum algorithms. Samuel A. Stein, Nathan Wiebe, Yufei Ding, Bo Peng, Karol Kowalski, Nathan A. Baker, James Ang, Ang Li. 59-71
- Axiomatic hardware-software contracts for security. Nicholas Mosier, Hanna Lachnitt, Hamed Nemati, Caroline Trippel. 72-86
- PPMLAC: high performance chipset architecture for secure multi-party computation. Xing Zhou, Zhilei Xu, Cong Wang, Mingyu Gao. 87-101
- INSPIRE: in-storage private information retrieval via protocol and architecture co-design. Jilan Lin, Ling Liang, Zheng Qu, Ishtiyaque Ahmad, Liu Liu, Fengbin Tu, Trinabh Gupta, Yufei Ding, Yuan Xie. 102-115
- TDGraph: a topology-driven accelerator for high-performance streaming graph processing. Jin Zhao, Yun Yang, Yu Zhang, Xiaofei Liao, Lin Gu, Ligang He, Bingsheng He, Hai Jin, Haikun Liu, Xinyu Jiang, Hui Yu. 116-129
- DIMMining: pruning-efficient and parallel graph mining on near-memory-computing. Guohao Dai, Zhenhua Zhu, Tianyu Fu, Chiyue Wei, Bangyan Wang, Xiangyu Li, Yuan Xie, Huazhong Yang, Yu Wang. 130-145
- NDMiner: accelerating graph pattern mining using near data processing. Nishil Talati, Haojie Ye, Yichen Yang, Leul Belayneh, Kuan-Yu Chen, David T. Blaauw, Trevor N. Mudge, Ronald G. Dreslinski. 146-159
- SoftVN: efficient memory protection via software-provided version numbers. Muhammad Umar, Weizhe Hua, Zhiru Zhang, G. Edward Suh. 160-172
- CraterLake: a hardware accelerator for efficient unbounded computation on encrypted data. Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Nathan Manohar, Nicholas Genise, Srinivas Devadas, Karim Eldefrawy, Chris Peikert, Daniel Sánchez. 173-187
- PS-ORAM: efficient crash consistency support for oblivious RAM on NVM. Gang Liu, Kenli Li, Zheng Xiao, Rujia Wang. 188-203
- There's always a bigger fish: a clarifying analysis of a machine-learning-assisted side-channel attack. Jack Cook, Jules Drean, Jonathan Behrens, Mengjia Yan. 204-217
- Gearbox: a case for supporting accumulation dispatching and hybrid partitioning in PIM-based accelerators. Marzieh Lenjani, Alif Ahmed, Mircea Stan, Kevin Skadron. 218-230
- To PIM or not for emerging general purpose processing in DDR memory systems. Alexandar Devic, Siddhartha Balakrishna Rai, Anand Sivasubramaniam, Ameen Akel, Sean Eilert, Justin Eno. 231-244
- MeNDA: a near-memory multi-way merge solution for sparse transposition and dataflows. Siying Feng, Xin He, Kuan-Yu Chen, Liu Ke, Xuan Zhang, David T. Blaauw, Trevor N. Mudge, Ronald G. Dreslinski. 245-258
- CaSMap: agile mapper for reconfigurable spatial architectures by automatically clustering intermediate representations and scattering mapping process. Xingchen Man, Jianfeng Zhu, Guihuan Song, Shouyi Yin, Shaojun Wei, Leibo Liu. 259-273
- FFCCD: fence-free crash-consistent concurrent defragmentation for persistent memory. Yuanchao Xu, Chencheng Ye, Yan Solihin, Xipeng Shen. 274-288
- LightPC: hardware and software co-design for energy-efficient full system persistence. Sangwon Lee, Miryeong Kwon, Gyuyoung Park, Myoungsoo Jung. 289-305
- ASAP: architecture support for asynchronous persistence. Ahmed H. M. O. Abulila, Izzat El Hajj, Myoungsoo Jung, Nam Sung Kim. 306-319
- Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning. Gagandeep Singh, Rakesh Nadig, Jisung Park, Rahul Bera, Nastaran Hajinazar, David Novo, Juan Gómez-Luna, Sander Stuijk, Henk Corporaal, Onur Mutlu. 320-336
- A synthesis framework for stitching surface code with superconducting quantum devices. Anbang Wu, Gushu Li, Hezi Zhang, Gian Giacomo Guerreschi, Yufei Ding, Yuan Xie. 337-350
- 2QAN: a quantum compiler for 2-local qubit Hamiltonian simulation algorithms. Lingling Lao, Dan E. Browne. 351-365
- XQsim: modeling cross-technology control processors for 10+K qubit quantum computers. Ilkwon Byun, Junpyo Kim, Dongmoon Min, Ikki Nagaoka, Kosuke Fukumitsu, Iori Ishikawa, Teruo Tanimoto, Masamitsu Tanaka, Koji Inoue, Jangwoo Kim. 366-382
- Geyser: a compilation framework for quantum computing with neutral atoms. Tirthak Patel, Daniel Silver, Devesh Tiwari. 383-395
- X-cache: a modular architecture for domain-specific caches. Ali Sedaghati, Milad Hakimi, Reza Hojabr, Arrvindh Shriraman. 396-409
- Register file prefetching. Sudhanshu Shukla, Sumeet Bandishte, Jayesh Gaur, Sreenivas Subramoney. 410-423
- GCoM: a detailed GPU core model for accurate analytical modeling of modern GPUs. Jounghoo Lee, Yeonan Ha, Suhyun Lee, Jinyoung Woo, Jinho Lee, Hanhwi Jang, Youngsok Kim. 424-436
- A scalable architecture for reprioritizing ordered parallelism. Gilead Posluns, Yan Zhu, Guowei Zhang, Mark C. Jeffrey. 437-453
- Rethinking programmable earable processors. Nathaniel Bleier, Muhammad Husnain Mubarik, Srijan Chakraborty, Shreyas Kishore, Rakesh Kumar. 454-467
- uBrain: a unary brain computer interface. Di Wu, Jingjie Li, Zhewen Pan, Younghyun Kim, Joshua San Miguel. 468-481
- Managing reliability skew in DNA storage. Dehui Lin, Yasamin Tabatabaee, Yash Pote, Djordje Jevdjic. 482-494
- EDAM: edit distance tolerant approximate matching content addressable memory. Robert Hanhan, Esteban Garzón, Zuher Jahshan, Adam Teman, Marco Lanuzza, Leonid Yavits. 495-507
- Increasing Ising machine capacity with multi-chip architectures. Anshujit Sharma, Richard Afoakwa, Zeljko Ignjatovic, Michael C. Huang. 508-521
- Cascading structured pruning: enabling high data reuse for sparse DNN accelerators. Edward Hanson, Shiyu Li, Hai Helen Li, Yiran Chen. 522-535
- Anticipating and eliminating redundant computations in accelerated sparse training. Jonathan S. Lew, Yunpeng Liu, Wenyi Gong, Negar Goli, R. David Evans, Tor M. Aamodt. 536-551
- SIMD²: a generalized matrix instruction set for accelerating tensor computation beyond GEMM. Yunan Zhang, Po-An Tsai, Hung-Wei Tseng. 552-566
- A software-defined tensor streaming multiprocessor for large-scale machine learning. Dennis Abts, Garrin Kimmell, Andrew C. Ling, John Kim, Matthew Boyd, Andrew Bitar, Sahil Parmar, Ibrahim Ahmed, Roberto DiCecco, David Han, John Thompson, Michael Bye, Jennifer Hwang, Jeremy Fowers, Peter Lillian, Ashwin Murthy, Elyas Mehtabuddin, Chetan Tekur, Thomas Sohmers, Kris Kang, Stephen Maresh, Jonathan Ross. 567-580
- Themis: a network bandwidth-aware collective scheduling policy for distributed training of DL models. Saeed Rashidi, William Won, Sudarshan Srinivasan, Srinivas Sridharan, Tushar Krishna. 581-596
- RACOD: algorithm/hardware co-design for mobile robot path planning. Mohammad Bakhshalipour, Seyed Borna Ehsani, Mohamad Qadri, Dominic Guri, Maxim Likhachev, Phillip B. Gibbons. 597-609
- EyeCoD: eye tracking system acceleration via FlatCam-based algorithm & accelerator co-design. Haoran You, Cheng Wan, Yang Zhao, Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Shang Wu, Shunyao Zhang, Yongan Zhang, Chaojian Li, Vivek Boominathan, Ashok Veeraraghavan, Ziyun Li, Yingyan Lin. 610-622
- Accelerating database analytic query workloads using an associative processor. Helena Caminal, Yannis Chronis, Tianshu Wu, Jignesh M. Patel, José F. Martínez. 623-637
- SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping. Damla Senol Cali, Konstantinos Kanellopoulos, Joël Lindegger, Zülal Bingöl, Gurpreet S. Kalsi, Ziyi Zuo, Can Firtina, Meryem Banu Cavlak, Jeremie Kim, Nika Mansouri-Ghiasi, Gagandeep Singh, Juan Gómez-Luna, Nour Almadhoun Alserr, Mohammed Alser, Sreenivas Subramoney, Can Alkan, Saugata Ghose, Onur Mutlu. 638-655
- BioHD: an efficient genome sequence search platform using HyperDimensional memorization. Zhuowen Zou, Hanning Chen, Prathyush Poduval, Yeseong Kim, Mahdi Imani, Elaheh Sadredini, Rosario Cammarota, Mohsen Imani. 656-669
- MOESI-prime: preventing coherence-induced hammering in commodity workloads. Kevin Loughlin, Stefan Saroiu, Alec Wolman, Yatin A. Manerkar, Baris Kasikci. 670-684
- PACMAN: attacking ARM pointer authentication with speculative execution. Joseph Ravichandran, Weon Taek Na, Jay Lang, Mengjia Yan. 685-698
- Hydra: enabling low-overhead mitigation of row-hammer at ultra-low thresholds via hybrid tracking. Moinuddin K. Qureshi, Aditya Rohan, Gururaj Saileshwar, Prashant J. Nair. 699-710
- BTS: an accelerator for bootstrappable fully homomorphic encryption. Sangpyo Kim, Jongmin Kim, Michael Jaemin Kim, Wonkyung Jung, John Kim, Minsoo Rhu, Jung Ho Ahn. 711-725
- MGX: near-zero overhead memory protection for data-intensive accelerators. Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh. 726-741
- Thermometer: profile-guided BTB replacement for data center applications. Shixin Song, Tanvir Ahmed Khan, Sara Mahdizadeh-Shahri, Akshitha Sriraman, Niranjan K. Soundararajan, Sreenivas Subramoney, Daniel A. Jiménez, Heiner Litz, Baris Kasikci. 742-756
- Lukewarm serverless functions: characterization and optimization. David Schall, Artemiy Margaritov, Dmitrii Ustiugov, Andreas Sandberg, Boris Grot. 757-770
- Dynamic global adaptive routing in high-radix networks. Hans Kasan, Gwangsun Kim, Yung Yi, John Kim. 771-783
- ACT: designing sustainable computer systems with an architectural carbon modeling tool. Udit Gupta, Mariam Elgamal, Gage Hills, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, Carole-Jean Wu. 784-799
- HiveMind: a hardware-software system stack for serverless edge swarms. Liam Patterson, David Pigorovsky, Brian Dempsey, Nikita Lazarev, Aditya Shah, Clara Steinhoff, Ariana Bruno, Justin Hu, Christina Delimitrou. 800-816
- Tiny but mighty: designing and realizing scalable latency tolerance for manycore SoCs. Marcelo Orenes-Vera, Aninda Manocha, Jonathan Balkind, Fei Gao, Juan L. Aragón, David Wentzlaff, Margaret Martonosi. 817-830
- FlexiCores: low footprint, high yield, field reprogrammable flexible microprocessors. Nathaniel Bleier, Calvin Lee, Francisco Rodriguez, Antony Sou, Scott White, Rakesh Kumar. 831-846
- SNS's not a synthesizer: a deep-learning-based synthesis predictor. Ceyu Xu, Chris Kjellqvist, Lisa Wu Wills. 847-859
- Training personalized recommendation systems from (GPU) scratch: look forward not backwards. Youngeun Kwon, Minsoo Rhu. 860-873
- AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction. Size Zheng, Renze Chen, Anjiang Wei, Yicheng Jin, Qin Han, Liqiang Lu, Bingyang Wu, Xiuhong Li, Shengen Yan, Yun Liang. 874-887
- Mokey: enabling narrow fixed-point inference for out-of-the-box floating-point transformer models. Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Andreas Moshovos. 888-901
- Accelerating attention through gradient-based learned runtime pruning. Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang. 902-915
- Graphite: optimizing graph neural networks on CPUs through cooperative software-hardware techniques. Zhangxiaowen Gong, Houxiang Ji, Yao Yao, Christopher W. Fletcher, Christopher J. Hughes, Josep Torrellas. 916-931
- SmartSAGE: training large-scale graph neural networks using in-storage processing architectures. Yunjae Lee, Jinha Chung, Minsoo Rhu. 932-945
- Hyperscale FPGA-as-a-service architecture for large-scale distributed graph neural network. Shuangchen Li, Dimin Niu, Yuhao Wang, Wei Han, Zhe Zhang, Tianchan Guan, Yijin Guan, Heng Liu, Linyong Huang, Zhaoyang Du, Fei Xue, Yuanwei Fang, Hongzhong Zheng, Yuan Xie. 946-961
- Crescent: taming memory irregularities for accelerating deep point cloud analytics. Yu Feng, Gunnar Hammonds, Yiming Gan, Yuhao Zhu. 962-977
- The Mozart reuse exposed dataflow processor for AI and beyond: industrial product. Karthikeyan Sankaralingam, Tony Nowatzki, Vinay Gangadhar, Preyas Shah, Michael Davies, William Galliher, Ziliang Guo, Jitu Khare, Deepak Vijay, Poly Palamuttam, Maghawan Punde, Alex Tan, Vijay Thiruvengadam, Rongyi Wang, Shunmiao Xu. 978-992
- Software-hardware co-design for fast and scalable training of deep learning recommendation models. Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, JongSoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, K. R. Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao. 993-1011
- AI accelerator on IBM Telum processor: industrial product. Cédric Lichtenau, Alper Buyuktosunoglu, Ramon Bertran, Peter Figuli, Christian Jacobi, Nikolaos Papandreou, Haris Pozidis, Anthony Saporito, Andrew Sica, Elpida Tzortzatos. 1012-1028
- Fidas: fortifying the cloud via comprehensive FPGA-based offloading for intrusion detection: industrial product. Jian Chen, Xiaoyu Zhang, Tao Wang, Ying Zhang, Tao Chen, Jiajun Chen, Mingxu Xie, Qiang Liu. 1029-1041
- Understanding data storage and ingestion for large-scale deep recommendation model training: industrial product. Mark Zhao, Niket Agarwal, Aarti Basant, Bugra Gedik, Satadru Pan, Mustafa Ozdal, Rakesh Komuravelli, Jerry Pan, Tianshu Bao, Haowei Lu, Sundaram Narayanan, Jack Langman, Kevin Wilfong, Harsha Rastogi, Carole-Jean Wu, Christos Kozyrakis, Parik Pol. 1042-1057
- Mixed-proxy extensions for the NVIDIA PTX memory consistency model: industrial product. Daniel Lustig, Simon Cooksey, Olivier Giroux. 1058-1070