Everything at Once - Multi-modal Fusion Transformer for Video Retrieval

Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas 0001, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne. Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pages 19988-19997. IEEE, 2022.

Authors

Nina Shvetsova

Brian Chen

Andrew Rouditchenko

Samuel Thomas 0001

Brian Kingsbury

Rogério Feris

David Harwath

James R. Glass

Hilde Kuehne
