Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Mohammad Shoeybi, Ming-Yu Liu, Yuke Zhu, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar. Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 11844-11857. Association for Computational Linguistics, 2023.

@inproceedings{YangPLKNHFYLLS023,
  title = {Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning},
  author = {Zhuolin Yang and Wei Ping and Zihan Liu and Vijay Korthikanti and Weili Nie and De-An Huang and Linxi Fan and Zhiding Yu and Shiyi Lan and Bo Li and Mohammad Shoeybi and Ming-Yu Liu and Yuke Zhu and Bryan Catanzaro and Chaowei Xiao and Anima Anandkumar},
  year = {2023},
  url = {https://aclanthology.org/2023.findings-emnlp.793},
  researchr = {https://researchr.org/publication/YangPLKNHFYLLS023},
  pages = {11844--11857},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023},
  editor = {Houda Bouamor and Juan Pino and Kalika Bali},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-061-5},
}