The following publications are possibly variants of this publication:
- Efficient Medical Images Text Detection with Vision-Language Pre-training Approach. Tianyang Li, Jinxu Bai, Qingzhu Wang, Hanwen Xu. ACML 2023: 755-770
- COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment. Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Ji Zhang. MM 2023: 4480-4491
- BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang. ICCV 2023: 2888-2898
- TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training. Chaoya Jiang, Wei Ye, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang 0011, Shikun Zhang. AAAI 2024: 2489-2497
- On Efficient Transformer-Based Image Pre-training for Low-Level Vision. Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu. IJCAI 2023: 1089-1097