Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao. Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. Pages 2978-2988, IEEE, 2021. doi:10.1109/ICCV48922.2021.00299
@inproceedings{ZhangDYXY0G21,
  title = {Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding},
  author = {Pengchuan Zhang and Xiyang Dai and Jianwei Yang and Bin Xiao and Lu Yuan and Lei Zhang and Jianfeng Gao},
  year = {2021},
  doi = {10.1109/ICCV48922.2021.00299},
  url = {https://doi.org/10.1109/ICCV48922.2021.00299},
  researchr = {https://researchr.org/publication/ZhangDYXY0G21},
  cites = {0},
  citedby = {0},
  pages = {2978--2988},
  booktitle = {2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021},
  publisher = {IEEE},
  isbn = {978-1-6654-2812-5},
}