Video-LLaVA: Learning United Visual Representation by Alignment Before Projection


Bin Lin, Yang Ye, Bin Zhu, Jiaxi Cui, Munan Ning, Peng Jin, Li Yuan 0001. Video-LLaVA: Learning United Visual Representation by Alignment Before Projection. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024, pages 5971-5984. Association for Computational Linguistics, 2024.

