MSViT: Training Multiscale Vision Transformers for Image Retrieval

Xue Li, Jiong Yu, Shaochen Jiang, Hongchun Lu, Ziyang Li. MSViT: Training Multiscale Vision Transformers for Image Retrieval. IEEE Transactions on Multimedia, 26:2809-2823, 2024.

Abstract

Abstract is missing.