TrainBF: High-Performance DNN Training Engine Using BFloat16 on AI Accelerators

Zhen Xie, Siddhisanket Raskar, Murali Emani, Venkatram Vishwanath. TrainBF: High-Performance DNN Training Engine Using BFloat16 on AI Accelerators. In José Cano, Marios D. Dikaiakos, George A. Papadopoulos, Miquel Pericàs, Rizos Sakellariou, editors, Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28 - September 1, 2023, Proceedings. Volume 14100 of Lecture Notes in Computer Science, pages 458-473, Springer, 2023.
