Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler. Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 12342-12364. Association for Computational Linguistics, 2023.

Authors

Yi Tay

Mostafa Dehghani

Samira Abnar

Hyung Won Chung

William Fedus

Jinfeng Rao

Sharan Narang

Vinh Q. Tran

Dani Yogatama

Donald Metzler
