Searching for Efficient Transformers for Language Modeling

David R. So, Wojciech Manke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc V. Le. Searching for Efficient Transformers for Language Modeling. In Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual. Pages 6010-6022, 2021.

Authors

David R. So

Wojciech Manke

Hanxiao Liu

Zihang Dai

Noam Shazeer

Quoc V. Le
