Everything at Once - Multi-modal Fusion Transformer for Video Retrieval

Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne. Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pp. 19988-19997. IEEE, 2022. doi:10.1109/CVPR52688.2022.01939

@inproceedings{ShvetsovaCR0KFH22,
  title = {Everything at Once - Multi-modal Fusion Transformer for Video Retrieval},
  author = {Nina Shvetsova and Brian Chen and Andrew Rouditchenko and Samuel Thomas and Brian Kingsbury and Rogério Feris and David Harwath and James R. Glass and Hilde Kuehne},
  year = {2022},
  doi = {10.1109/CVPR52688.2022.01939},
  url = {https://doi.org/10.1109/CVPR52688.2022.01939},
  researchr = {https://researchr.org/publication/ShvetsovaCR0KFH22},
  cites = {0},
  citedby = {0},
  pages = {19988-19997},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022},
  publisher = {IEEE},
  isbn = {978-1-6654-6946-3},
}
