Optimizing Deeper Transformers on Small Datasets

Peng Xu, Dhruv Kumar, Wei Yang, Wenjie Zi, Keyi Tang, Chenyang Huang, Jackie Chi Kit Cheung, Simon J. D. Prince, Yanshuai Cao. Optimizing Deeper Transformers on Small Datasets. In Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli, editors, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021 (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pages 2089-2102. Association for Computational Linguistics, 2021.

@inproceedings{Xu0YZT0CPC20,
  title = {Optimizing Deeper Transformers on Small Datasets},
  author = {Peng Xu and Dhruv Kumar and Wei Yang and Wenjie Zi and Keyi Tang and Chenyang Huang and Jackie Chi Kit Cheung and Simon J. D. Prince and Yanshuai Cao},
  year = {2021},
  url = {https://aclanthology.org/2021.acl-long.163},
  researchr = {https://researchr.org/publication/Xu0YZT0CPC20},
  cites = {0},
  citedby = {0},
  pages = {2089--2102},
  booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021 (Volume 1: Long Papers), Virtual Event, August 1-6, 2021},
  editor = {Chengqing Zong and Fei Xia and Wenjie Li and Roberto Navigli},
  publisher = {Association for Computational Linguistics},
  isbn = {978-1-954085-52-7},
}