ShowUI-$π$: Flow-based Generative Model for GUI Dexterity
Analysis
Key Takeaways
- Proposes ShowUI-$π$, a flow-based generative model for GUI control.
- Introduces a unified discrete-continuous action space for flexible interaction.
- Employs flow-based action generation for smooth drag trajectories.
- Creates ScreenDrag, a new benchmark for evaluating GUI agent drag capabilities.
- Demonstrates superior performance compared to existing proprietary agents.
“ShowUI-$π$ achieves 26.98 with only 450M parameters, underscoring both the difficulty of the task and the effectiveness of our approach.”
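
To make the idea of flow-based action generation in the takeaways more concrete, here is a minimal sketch of how a drag trajectory could be sampled by integrating a velocity field from noise toward cursor waypoints. It is not the paper's implementation. Everything in it is an assumption made for illustration: the function names (`straight_drag`, `toy_velocity`, `sample_trajectory`), the normalized screen coordinates, the straight-line target, and above all the closed-form velocity field, which stands in for the learned network a real model would use.

```python
import random


def straight_drag(start, end, num_points):
    """Hypothetical target: evenly spaced waypoints on a straight drag path.

    Coordinates are assumed to be normalized to [0, 1] screen space.
    """
    return [
        (
            start[0] + (end[0] - start[0]) * i / (num_points - 1),
            start[1] + (end[1] - start[1]) * i / (num_points - 1),
        )
        for i in range(num_points)
    ]


def toy_velocity(x, t, target):
    """Stand-in for a learned velocity network (assumption, not the paper's model).

    For the linear path x_t = (1 - t) * x_0 + t * x_1, the velocity that carries
    the current point x to the target x_1 by time 1 is (x_1 - x) / (1 - t).
    """
    return [
        ((tx - px) / (1.0 - t), (ty - py) / (1.0 - t))
        for (px, py), (tx, ty) in zip(x, target)
    ]


def sample_trajectory(target, num_steps=10, seed=0):
    """Euler-integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (trajectory)."""
    rng = random.Random(seed)
    # Start from Gaussian noise, one 2D point per waypoint.
    x = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in target]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) never hits zero
        v = toy_velocity(x, t, target)
        x = [(px + dt * vx, py + dt * vy) for (px, py), (vx, vy) in zip(x, v)]
    return x


if __name__ == "__main__":
    waypoints = straight_drag((0.1, 0.2), (0.8, 0.6), num_points=5)
    for point in sample_trajectory(waypoints):
        print(f"({point[0]:.3f}, {point[1]:.3f})")
```

Because the toy field has direct access to the target, the sampled points land on the waypoints up to floating-point error. In a trained model the velocity would instead be predicted from the screenshot and instruction, and the same integration loop would produce the continuous cursor path.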