Verbs in Action: Improving verb understanding in video-language models

Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid. Verbs in Action: Improving verb understanding in video-language models. In IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, October 1-6, 2023, pages 15533-15545. IEEE, 2023.
