CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

Chun-Fu (Richard) Chen, Quanfu Fan, Rameswar Panda. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021, pages 347-356. IEEE, 2021.

Authors

Chun-Fu (Richard) Chen

Quanfu Fan

Rameswar Panda
