DGEMM using FP64 Arithmetic Emulation and FP8 Tensor Cores with Ozaki Scheme

Daichi Mukunoki. DGEMM using FP64 Arithmetic Emulation and FP8 Tensor Cores with Ozaki Scheme. In Proceedings of the Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops, SCA/HPCAsiaWS 2026, Osaka, Japan, January 26-29, 2026, pages 303-311. ACM, 2026.
