GeoMag: A Vision-Language Model for Pixel-level Fine-Grained Remote Sensing Image Parsing

Xianzhi Ma, Jianhui Li, Changhua Pei, Hao Liu. GeoMag: A Vision-Language Model for Pixel-level Fine-Grained Remote Sensing Image Parsing. In Cathal Gurrin, Klaus Schoeffmann, Min Zhang, Luca Rossetto, Stevan Rudinac, Duc-Tien Dang-Nguyen, Wen-Huang Cheng, Phoebe Chen, Jenny Benois-Pineau, editors, Proceedings of the 33rd ACM International Conference on Multimedia, MM 2025, Dublin, Ireland, October 27-31, 2025. Pages 5441-5450. ACM, 2025. doi:10.1145/3746027.3754559

@inproceedings{MaLP025,
  title = {{GeoMag}: A Vision-Language Model for Pixel-level Fine-Grained Remote Sensing Image Parsing},
  author = {Xianzhi Ma and Jianhui Li and Changhua Pei and Hao Liu},
  year = {2025},
  doi = {10.1145/3746027.3754559},
  url = {https://doi.org/10.1145/3746027.3754559},
  researchr = {https://researchr.org/publication/MaLP025},
  pages = {5441--5450},
  booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia, MM 2025, Dublin, Ireland, October 27-31, 2025},
  editor = {Cathal Gurrin and Klaus Schoeffmann and Min Zhang and Luca Rossetto and Stevan Rudinac and Duc-Tien Dang-Nguyen and Wen-Huang Cheng and Phoebe Chen and Jenny Benois-Pineau},
  publisher = {ACM},
  isbn = {979-8-4007-2035-2},
}