Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, Cordelia Schmid. Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023, pages 10714-10726. IEEE, 2023.

Authors

Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid