ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Stan Weixian Lei, Lijuan Wang, Mike Zheng Shou. ShowUI: One Vision-Language-Action Model for GUI Visual Agent. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. Pages 19498-19508, Computer Vision Foundation / IEEE, 2025.
