Auto-tuning Dense Matrix Multiplication for GPGPU with Cache

Xiang Cui, Yifeng Chen, Changyou Zhang, Hong Mei. Auto-tuning Dense Matrix Multiplication for GPGPU with Cache. In IEEE 16th International Conference on Parallel and Distributed Systems, ICPADS 2010, 8-10 Dec. 2010, Shanghai, China. pages 237-242, IEEE, 2010.

@inproceedings{CuiCZM10,
  title = {Auto-tuning Dense Matrix Multiplication for GPGPU with Cache},
  author = {Xiang Cui and Yifeng Chen and Changyou Zhang and Hong Mei},
  year = {2010},
  doi = {10.1109/ICPADS.2010.64},
  url = {http://dx.doi.org/10.1109/ICPADS.2010.64},
  tags = {caching},
  researchr = {https://researchr.org/publication/CuiCZM10},
  cites = {0},
  citedby = {0},
  pages = {237--242},
  booktitle = {IEEE 16th International Conference on Parallel and Distributed Systems, ICPADS 2010, 8-10 Dec. 2010, Shanghai, China},
  publisher = {IEEE},
}