Advancing Multimodal LLMs by Large-Scale 3D Visual Instruction Dataset Generation

Liu He, Xiao Zeng, Yizhi Song, Albert Y. C. Chen, Lu Xia, Shashwat Verma, Sankalp Dayal, Min Sun, Cheng-Hao Kuo, Daniel G. Aliaga. Advancing Multimodal LLMs by Large-Scale 3D Visual Instruction Dataset Generation. In IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2026, Tucson, AZ, USA, March 6-10, 2026. pages 5886-5897, IEEE, 2026.
