FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models

Jiaao He, Jidong Zhai, Tiago Antunes, Haojie Wang, Fuwen Luo, Shangfeng Shi, Qin Li. FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models. In Jaejin Lee, Kunal Agrawal, Michael F. Spear, editors, PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2 - 6, 2022, pages 120-134. ACM, 2022.

@inproceedings{HeZAWLSL22,
  title = {{FasterMoE}: modeling and optimizing training of large-scale dynamic pre-trained models},
  author = {Jiaao He and Jidong Zhai and Tiago Antunes and Haojie Wang and Fuwen Luo and Shangfeng Shi and Qin Li},
  year = {2022},
  doi = {10.1145/3503221.3508418},
  url = {https://doi.org/10.1145/3503221.3508418},
  researchr = {https://researchr.org/publication/HeZAWLSL22},
  cites = {0},
  citedby = {0},
  pages = {120--134},
  booktitle = {PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2 - 6, 2022},
  editor = {Jaejin Lee and Kunal Agrawal and Michael F. Spear},
  publisher = {ACM},
  isbn = {978-1-4503-9204-4},
}