Optimizing Deeper Transformers on Small Datasets

Peng Xu, Dhruv Kumar 0005, Wei Yang, Wenjie Zi, Keyi Tang, Chenyang Huang 0001, Jackie Chi Kit Cheung, Simon J. D. Prince, Yanshuai Cao. Optimizing Deeper Transformers on Small Datasets. In Chengqing Zong, Fei Xia, Wenjie Li 0002, Roberto Navigli, editors, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021 (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pages 2089-2102. Association for Computational Linguistics, 2021.

Authors

Peng Xu

Dhruv Kumar 0005

Wei Yang

Wenjie Zi

Keyi Tang

Chenyang Huang 0001

Jackie Chi Kit Cheung

Simon J. D. Prince

Yanshuai Cao
