Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs

Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu. Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, May 18-22, 2020. pages 440-450, IEEE, 2020.

@inproceedings{LiDXH20-0,
  title = {Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs},
  author = {Cheng Li and Abdul Dakkak and Jinjun Xiong and Wen-mei Hwu},
  year = {2020},
  doi = {10.1109/IPDPS47924.2020.00053},
  url = {https://doi.org/10.1109/IPDPS47924.2020.00053},
  researchr = {https://researchr.org/publication/LiDXH20-0},
  cites = {0},
  citedby = {0},
  pages = {440--450},
  booktitle = {2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, May 18-22, 2020},
  publisher = {IEEE},
  isbn = {978-1-7281-6876-0},
}