Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

Zhihang Lin, Mingbao Lin, Luxi Lin, Rongrong Ji. Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference. In Toby Walsh, Julie Shah, Zico Kolter, editors, AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25 - March 4, 2025, Philadelphia, PA, USA. pages 5334-5342, AAAI Press, 2025.
