Transcending Scaling Laws with 0.1% Extra Compute

Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc Le, Mostafa Dehghani. Transcending Scaling Laws with 0.1% Extra Compute. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023. Pages 1471-1486, Association for Computational Linguistics, 2023.

Authors

Yi Tay

Jason Wei

Hyung Won Chung

Vinh Q. Tran

David R. So

Siamak Shakeri

Xavier Garcia

Huaixiu Steven Zheng

Jinfeng Rao

Aakanksha Chowdhery

Denny Zhou

Donald Metzler

Slav Petrov

Neil Houlsby

Quoc Le

Mostafa Dehghani
