Scale Efficiently: Insights from Pretraining and Finetuning Transformers

Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler. Scale Efficiently: Insights from Pretraining and Finetuning Transformers. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022.

Authors

Yi Tay

Mostafa Dehghani

Jinfeng Rao

William Fedus

Samira Abnar

Hyung Won Chung

Sharan Narang

Dani Yogatama

Ashish Vaswani

Donald Metzler
