Research Paper
Tags: GUI Agents, Flow-based Generative Models, Dexterous Manipulation
Category: Research
Analyzed: Jan 3, 2026 06:18
ShowUI-$π$: Flow-based Generative Model for GUI Dexterity
Published: Dec 31, 2025 16:51
Source: arXiv
Analysis
This paper introduces ShowUI-$π$, a flow-based generative model for GUI agent control. Existing agents rely on discrete click predictions, which cannot express continuous, closed-loop trajectories such as dragging; ShowUI-$π$ is designed to produce them. The paper's contributions are its architecture, a new benchmark (ScreenDrag), and reported results that exceed those of existing proprietary agents, pointing to more human-like interaction in digital environments.
Key Takeaways
- Proposes ShowUI-$π$, a flow-based generative model for GUI control.
- Introduces a unified discrete-continuous action space for flexible interaction.
- Employs flow-based action generation for smooth drag trajectories.
- Creates ScreenDrag, a new benchmark for evaluating GUI agent drag capabilities.
- Demonstrates superior performance compared to existing proprietary agents.
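To make "flow-based action generation" concrete, here is a minimal sketch of the general idea, not the paper's implementation. It assumes a straight-line (rectified-flow style) probability path and replaces the learned network with a hand-written velocity field; the names `toy_velocity`, `sample_action`, and `closed_loop_drag`, along with the step sizes, are hypothetical. An action is sampled by starting from Gaussian noise and integrating the velocity field from t = 0 to t = 1 with Euler steps, and a drag is produced in a closed loop by re-reading the cursor position before each new sample.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Velocity = Callable[[Point, float, Point], Point]


def toy_velocity(x: Point, t: float, target: Point) -> Point:
    """Hypothetical stand-in for a learned velocity field.

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1, the ideal
    velocity at x_t is (x_1 - x_t) / (1 - t). A real model would predict
    this from the screenshot and instruction instead of knowing x_1.
    """
    scale = 1.0 / max(1.0 - t, 1e-6)
    return ((target[0] - x[0]) * scale, (target[1] - x[1]) * scale)


def sample_action(velocity: Velocity, target: Point,
                  steps: int = 10, seed: int = 0) -> Point:
    """Draw noise, then integrate the velocity field with Euler steps."""
    rng = random.Random(seed)
    x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))  # x_0 ~ N(0, I)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity(x, i * dt, target)
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
    return x


def closed_loop_drag(start: Point, goal: Point,
                     max_step: float = 50.0, max_iters: int = 100) -> List[Point]:
    """Assumed closed-loop rollout: observe the cursor, sample the next move."""
    cursor = start
    path = [cursor]
    for k in range(max_iters):
        dx, dy = goal[0] - cursor[0], goal[1] - cursor[1]
        dist = (dx * dx + dy * dy) ** 0.5
        if dist < 1e-3:  # close enough to release the drag
            break
        frac = min(1.0, max_step / dist)  # cap the per-step displacement
        waypoint = (cursor[0] + frac * dx, cursor[1] + frac * dy)
        cursor = sample_action(toy_velocity, waypoint, seed=k)
        path.append(cursor)
    return path


if __name__ == "__main__":
    for point in closed_loop_drag((0.0, 0.0), (120.0, 90.0)):
        print(f"({point[0]:.2f}, {point[1]:.2f})")
```

In this toy setting the velocity field already knows the waypoint, so the sampler only shows the integration mechanics. The contrast with a discrete click predictor is that the output is a continuous point refined over several integration steps, and the loop can correct course after every observation.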
Reference
“ShowUI-$π$ achieves 26.98 with only 450M parameters, underscoring both the difficulty of the task and the effectiveness of our approach.”