Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution

Yufan Xu, Saurabh Raje, Atanas Rountev, Gerald Sabin, Aravind Sukumaran-Rajam, P. Sadayappan. Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution. In Bernhard Egger, Aaron Smith, editors, CC '22: 31st ACM SIGPLAN International Conference on Compiler Construction, Seoul, South Korea, April 2 - 3, 2022. Pages 104-116, ACM, 2022. doi:10.1145/3497776.3517766

@inproceedings{XuRRSSS22,
  title = {Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution},
  author = {Yufan Xu and Saurabh Raje and Atanas Rountev and Gerald Sabin and Aravind Sukumaran-Rajam and P. Sadayappan},
  year = {2022},
  doi = {10.1145/3497776.3517766},
  url = {https://doi.org/10.1145/3497776.3517766},
  researchr = {https://researchr.org/publication/XuRRSSS22},
  cites = {0},
  citedby = {0},
  pages = {104--116},
  booktitle = {CC '22: 31st ACM SIGPLAN International Conference on Compiler Construction, Seoul, South Korea, April 2 - 3, 2022},
  editor = {Bernhard Egger and Aaron Smith},
  publisher = {ACM},
  isbn = {978-1-4503-9183-2},
}