The following publications are possibly variants of this publication:
- DiT: Self-supervised Pre-training for Document Image Transformer. Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui 0001, Cha Zhang, Furu Wei. MM 2022: 3530-3539 [doi]
- Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis. Yucheng Tang, Dong Yang, Wenqi Li 0001, Holger R. Roth, Bennett A. Landman, Daguang Xu, Vishwesh Nath, Ali Hatamizadeh. CVPR 2022: 20698-20708 [doi]
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. Wenhui Wang, Furu Wei, Li Dong 0004, Hangbo Bao, Nan Yang 0002, Ming Zhou 0001. NeurIPS 2020 [doi]