COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

Haoyu Lu, Nanyi Fei, Yuqi Huo, Yizhao Gao, Zhiwu Lu, Ji-Rong Wen. COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pages 15671-15680. IEEE, 2022. doi:10.1109/CVPR52688.2022.01524

@inproceedings{LuFHG0W22,
  title = {COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval},
  author = {Haoyu Lu and Nanyi Fei and Yuqi Huo and Yizhao Gao and Zhiwu Lu and Ji-Rong Wen},
  year = {2022},
  doi = {10.1109/CVPR52688.2022.01524},
  url = {https://doi.org/10.1109/CVPR52688.2022.01524},
  researchr = {https://researchr.org/publication/LuFHG0W22},
  pages = {15671--15680},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022},
  publisher = {IEEE},
  isbn = {978-1-6654-6946-3},
}