ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Stan Weixian Lei, Lijuan Wang, Mike Zheng Shou. ShowUI: One Vision-Language-Action Model for GUI Visual Agent. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pages 19498-19508, Computer Vision Foundation / IEEE, 2025.

Authors

Kevin Qinghong Lin

Linjie Li

Difei Gao

Zhengyuan Yang

Shiwei Wu

Zechen Bai

Stan Weixian Lei

Lijuan Wang

Mike Zheng Shou
