Linchao Zhu, Yi Yang. ActBERT: Learning Global-Local Video-Text Representations. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. pages 8743-8752, IEEE, 2020.