Instruction-ViT: Multi-modal prompts for instruction learning in vision transformer

Zhenxiang Xiao, Yuzhong Chen, Junjie Yao, Lu Zhang, Zhengliang Liu, Zihao Wu, Xiaowei Yu, Yi Pan, Lin Zhao, Chong Ma, Xinyu Liu, Wei Liu, Xiang Li, Yixuan Yuan, Dinggang Shen, Dajiang Zhu, Dezhong Yao, Tianming Liu, Xi Jiang. Instruction-ViT: Multi-modal prompts for instruction learning in vision transformer. Information Fusion, 104:102204, April 2024.
