Searching for Efficient Transformers for Language Modeling

David R. So, Wojciech Manke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc V. Le. Searching for Efficient Transformers for Language Modeling. In Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual. pages 6010-6022, 2021.

@inproceedings{SoMLDSL21,
  title = {Searching for Efficient Transformers for Language Modeling},
  author = {David R. So and Wojciech Manke and Hanxiao Liu and Zihang Dai and Noam Shazeer and Quoc V. Le},
  year = {2021},
  url = {https://proceedings.neurips.cc/paper/2021/hash/2f3c6a4cd8af177f6456e7e51a916ff3-Abstract.html},
  researchr = {https://researchr.org/publication/SoMLDSL21},
  cites = {0},
  citedby = {0},
  pages = {6010--6022},
  booktitle = {Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual},
  editor = {Marc'Aurelio Ranzato and Alina Beygelzimer and Yann N. Dauphin and Percy Liang and Jennifer Wortman Vaughan},
}