Ensemble of One Model: Creating Model Variations for Transformer with Layer Permutation

Andrew Liaw, Jia-Hao Hsu, Chung-Hsien Wu. Ensemble of One Model: Creating Model Variations for Transformer with Layer Permutation. In Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021, Tokyo, Japan, December 14-17, 2021. pages 1026-1030, IEEE, 2021.
