The following publications are possibly variants of this publication:
- What Makes for Good Tokenizers in Vision Transformer? Shengju Qian, Yi Zhu 0001, Wenbo Li, Mu Li 0003, Jiaya Jia. PAMI, 45(11):13011-13023, November 2023.
- Convolutional Embedding Makes Hierarchical Vision Transformer Stronger. Cong Wang, Hongmin Xu, Xiong Zhang, Li Wang, Zhitong Zheng, Haifeng Liu. ECCV 2022: 739-756.