Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. Pages 7992-8001, IEEE, 2021.
