Video Token Sparsification for Efficient Multimodal LLMs in Driving Visual Question Answering

Yunsheng Ma, Amr Abdelraouf, Rohit Gupta, Ahmadreza Moradipari, Ziran Wang, Kyungtae Han. Video Token Sparsification for Efficient Multimodal LLMs in Driving Visual Question Answering. In IEEE Intelligent Vehicles Symposium, IV 2025, Cluj-Napoca, Romania, June 22-25, 2025. pages 2235-2242, IEEE, 2025.
