Rathinakumar Appuswamy, Michael V. DeBole, Brian Taba, Steven K. Esser, Andrew S. Cassidy, Arnon Amir, Alexander Andreopoulos, Deepika Bablani, Pallab Datta, Jeffrey A. Kusnitz, Nathaniel J. McClatchey, Neil McGlohon, Jeffrey L. McKinstry, Tapan K. Nayak, Daniel F. Smith, Rafael Sousa, Ignacio Terrizzano, Filipp Akopyan, Peter J. Carlson, Rajamohan Gandhasri, Guillaume Garreau, Nelson M. Gonzalez, Megumi Ito, Jennifer L. Klamo, Yutaka Y. Nakamura, Carlos Ortega-Otero, William P. Risk, Jun Sawada, Kai Schleupen, Jay Sivagnaname, Matthew Stallone, Takanori Ueda, Myron D. Flickner, John V. Arthur, Rameswar Panda, David D. Cox, Dharmendra S. Modha. Breakthrough Low-Latency, High-Energy-Efficiency LLM Inference Performance Using NorthPole. In IEEE High Performance Extreme Computing Conference, HPEC 2024, Wakefield, MA, USA, September 23-27, 2024. Pages 1-8, IEEE, 2024.