RS-MoE: A Vision-Language Model With Mixture of Experts for Remote Sensing Image Captioning and Visual Question Answering

Hui Lin, Danfeng Hong, Shuhang Ge, Chuyao Luo, Kai Jiang, Hao Jin, Congcong Wen. RS-MoE: A Vision-Language Model With Mixture of Experts for Remote Sensing Image Captioning and Visual Question Answering. IEEE Transactions on Geoscience and Remote Sensing, 63:1-18, 2025.
