COME: Clip-OCR and Master ObjEct for text image captioning

Gang Lv, Yining Sun, Fudong Nian, Maofei Zhu, Wenliang Tang, Zhenzhen Hu. COME: Clip-OCR and Master ObjEct for text image captioning. Image Vision Comput., 136:104751, August 2023. https://doi.org/10.1016/j.imavis.2023.104751

@article{LvSNZTH23,
  title = {COME: Clip-OCR and Master ObjEct for text image captioning},
  author = {Gang Lv and Yining Sun and Fudong Nian and Maofei Zhu and Wenliang Tang and Zhenzhen Hu},
  year = {2023},
  month = {August},
  doi = {10.1016/j.imavis.2023.104751},
  url = {https://doi.org/10.1016/j.imavis.2023.104751},
  researchr = {https://researchr.org/publication/LvSNZTH23},
  cites = {0},
  citedby = {0},
  journal = {Image Vision Comput.},
  volume = {136},
  pages = {104751},
}