Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory

Ziniu Hu, Ahmet Iscen, Chen Sun 0002, Zirui Wang, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, Alireza Fathi. Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023. Pages 23369-23379, IEEE, 2023.

Authors

Ziniu Hu

Ahmet Iscen

Chen Sun 0002

Zirui Wang

Kai-Wei Chang

Yizhou Sun

Cordelia Schmid

David A. Ross

Alireza Fathi
