Multimodal fusion with vision-language-action models for robotic manipulation: A systematic review

Muhayy Ud Din, Waseem Akram, Lyes Saad Saoud, Jan Rosell, Irfan Hussain. Multimodal fusion with vision-language-action models for robotic manipulation: A systematic review. Information Fusion, 129:104062, 2026.
