The following publications are possibly variants of this publication:
- Modal Interaction-Enhanced Prompt Learning by Transformer Decoder for Vision-Language Models. Mingyue Liu, Honggang Zhao, Longfei Ma, Xiang Li, Yucheng Ji, Mingyong Li. ksem 2023: 163-174
- Instruction-ViT: Multi-modal prompts for instruction learning in vision transformer. Zhenxiang Xiao, Yuzhong Chen, Junjie Yao, Lu Zhang 0050, Zhengliang Liu, Zihao Wu 0001, Xiaowei Yu, Yi Pan, Lin Zhao, Chong Ma, Xinyu Liu, Wei Liu, Xiang Li 0001, Yixuan Yuan, Dinggang Shen, Dajiang Zhu, Dezhong Yao 0001, Tianming Liu, Xi Jiang 0001. inffus, 104:102204, April 2024.
- Learning Domain Invariant Prompt for Vision-Language Models. Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li 0002, Duoqian Miao. TIP, 33:1348-1360, 2024.