TrainBF: High-Performance DNN Training Engine Using BFloat16 on AI Accelerators

Zhen Xie, Siddhisanket Raskar, Murali Emani, Venkatram Vishwanath. TrainBF: High-Performance DNN Training Engine Using BFloat16 on AI Accelerators. In José Cano, Marios D. Dikaiakos, George A. Papadopoulos, Miquel Pericàs, Rizos Sakellariou, editors, Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28 - September 1, 2023, Proceedings. Volume 14100 of Lecture Notes in Computer Science, pages 458-473, Springer, 2023.

Authors

Zhen Xie

Siddhisanket Raskar

Murali Emani

Venkatram Vishwanath
