ShowUI-$π$: Flow-based Generative Model for GUI Dexterity

Published:Dec 31, 2025 16:51
1 min read
ArXiv

Analysis

This paper introduces ShowUI-$π$, a novel approach to GUI agent control using flow-based generative models. It addresses the limitations of existing agents that rely on discrete click predictions, enabling continuous, closed-loop trajectories like dragging. The work's significance lies in its innovative architecture, the creation of a new benchmark (ScreenDrag), and its demonstration of superior performance compared to existing proprietary agents, highlighting the potential for more human-like interaction in digital environments.

Reference

ShowUI-$π$ achieves 26.98 with only 450M parameters, underscoring both the difficulty of the task and the effectiveness of our approach.