Angela Fan, Edouard Grave, Armand Joulin. Reducing Transformer Depth on Demand with Structured Dropout. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020.