Do Video Language Models really understand the video contexts?

Jeongwan Shin, Jinhyeong Lim, Hyeyoung Park. Do Video Language Models really understand the video contexts? In Abteen Ebrahimi, Samar Haider, Emmy Liu, Sammar Haider, Maria Leonor Pacheco, Shira Wein, editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 4: Student Research Workshop, Albuquerque, NM, USA, April 30 - May 1, 2025, pages 408-417. Association for Computational Linguistics, 2025.

Authors

Jeongwan Shin


Jinhyeong Lim


Hyeyoung Park
