Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs

Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu. Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, May 18-22, 2020. Pages 440-450, IEEE, 2020.

Authors

Cheng Li

Abdul Dakkak

Jinjun Xiong

Wen-mei Hwu
