Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU

Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin. Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU. In IEEE 6th International Symposium on Embedded Multicore/Manycore SoCs, MCSoC 2012, Fukushima, Japan, September 20-22, 2012, pages 198-204. IEEE Computer Society, 2012. doi:10.1109/MCSoC.2012.30

@inproceedings{MatsumotoNS12-1,
  title = {Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU},
  author = {Kazuya Matsumoto and Naohito Nakasato and Stanislav G. Sedukhin},
  year = {2012},
  doi = {10.1109/MCSoC.2012.30},
  url = {https://doi.org/10.1109/MCSoC.2012.30},
  researchr = {https://researchr.org/publication/MatsumotoNS12-1},
  pages = {198--204},
  booktitle = {IEEE 6th International Symposium on Embedded Multicore/Manycore SoCs, MCSoC 2012, Fukushima, Japan, September 20-22, 2012},
  publisher = {IEEE Computer Society},
  isbn = {978-1-4673-2535-6},
}