Synthesizer: Rethinking Self-Attention for Transformer Models

Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng. Synthesizer: Rethinking Self-Attention for Transformer Models. In Marina Meila, Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Volume 139 of Proceedings of Machine Learning Research, pages 10183-10192, PMLR, 2021.

Authors

Yi Tay

Dara Bahri

Donald Metzler

Da-Cheng Juan

Zhe Zhao

Che Zheng
