MSViT: Training Multiscale Vision Transformers for Image Retrieval

Xue Li, Jiong Yu, Shaochen Jiang, Hongchun Lu, Ziyang Li. MSViT: Training Multiscale Vision Transformers for Image Retrieval. IEEE Transactions on Multimedia, 26:2809-2823, 2024.
