MobileWorld
MobileWorld: Measures browser, desktop, mobile, or GUI agents operating in interactive environments.
12 rows
Primary metric: success_rate
Sampled: 2026-05-27
Metadata
Metrics
Success Rate, GUI-Only Success Rate, User-Interaction Success Rate, MCP Success Rate, Average Completion Steps (lower is better), Average User Queries (lower is better), User Interaction Quality, Average MCP Calls (lower is better)
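The "lower is better" notes in the metric list matter when two agents are compared on a secondary metric. The following is a minimal sketch of that comparison, not part of MobileWorld itself: the names `LOWER_IS_BETTER` and `is_better` are hypothetical, and treating every unlabeled metric as higher-is-better is an assumption the source does not state.

```python
# Metrics that the list above marks "lower is better".
# Assumption: every metric not listed here is higher-is-better.
LOWER_IS_BETTER = {
    "Average Completion Steps",
    "Average User Queries",
    "Average MCP Calls",
}


def is_better(metric: str, a: float, b: float) -> bool:
    """Return True if value `a` beats value `b` on `metric` (hypothetical helper)."""
    if metric in LOWER_IS_BETTER:
        return a < b
    return a > b


if __name__ == "__main__":
    # Success rates taken from the top two rows of the table.
    print(is_better("Success Rate", 51.7, 46.3))
```

Keeping the direction in one set means a new metric only needs to be added there if a lower value is the desired outcome.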
| Rank | Subject | Success Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-5 + UI-Ins-7B | 51.7% | GPT-5 (openai-gpt-5) | Imported | 2026-05-27 |
| 2 | Gemini-3-Pro + UI-Ins-7B | 46.3% | — | Imported | 2026-05-27 |
| 3 | Claude-4.5-Sonnet + UI-Ins-7B | 43.8% | — | Imported | 2026-05-27 |
| 4 | Doubao-1.5-UI-TARS | 20.9% | — | Imported | 2026-05-27 |
| 5 | GELab-Zero-4B | 10.9% | — | Imported | 2026-05-27 |
| 6 | UI-Venus-72B | 10.4% | — | Imported | 2026-05-27 |
| 7 | Qwen3-VL-235B-A22B | 9.5% | — | Imported | 2026-05-27 |
| 8 | Qwen3-VL-32B | 9.0% | — | Imported | 2026-05-27 |
| 9 | GUI-Owl-32B | 5.5% | — | Imported | 2026-05-27 |
| 10 | Qwen3-VL-8B | 5.5% | — | Imported | 2026-05-27 |
| 11 | UI-Venus-7B | 5.5% | — | Imported | 2026-05-27 |
| 12 | GUI-Owl-7B | 4.5% | — | Imported | 2026-05-27 |
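Three entries in the table share a 5.5% success rate but are listed at ranks 9, 10, and 11; the page does not say how ties are ordered. As an illustration only, the sketch below parses the percentage strings and assigns standard competition ranks, where tied entries share the lowest rank. The `rows` list copies the table, while `to_float` and `competition_ranks` are hypothetical helpers, not the leaderboard's own method.

```python
from itertools import groupby

# (subject, success rate) pairs copied from the table above.
rows = [
    ("GPT-5 + UI-Ins-7B", "51.7%"),
    ("Gemini-3-Pro + UI-Ins-7B", "46.3%"),
    ("Claude-4.5-Sonnet + UI-Ins-7B", "43.8%"),
    ("Doubao-1.5-UI-TARS", "20.9%"),
    ("GELab-Zero-4B", "10.9%"),
    ("UI-Venus-72B", "10.4%"),
    ("Qwen3-VL-235B-A22B", "9.5%"),
    ("Qwen3-VL-32B", "9%"),
    ("GUI-Owl-32B", "5.5%"),
    ("Qwen3-VL-8B", "5.5%"),
    ("UI-Venus-7B", "5.5%"),
    ("GUI-Owl-7B", "4.5%"),
]


def to_float(pct: str) -> float:
    """Convert a string such as '51.7%' to the float 51.7."""
    return float(pct.rstrip("%"))


def competition_ranks(entries):
    """Map each subject to its rank; tied rates share the lowest rank."""
    ordered = sorted(entries, key=lambda r: to_float(r[1]), reverse=True)
    ranks = {}
    position = 1
    for _, group in groupby(ordered, key=lambda r: to_float(r[1])):
        members = list(group)
        for name, _rate in members:
            ranks[name] = position
        # Skip ahead by the size of the tie group.
        position += len(members)
    return ranks


if __name__ == "__main__":
    for name, rank in competition_ranks(rows).items():
        print(rank, name)
```

Under this scheme the three 5.5% entries all receive rank 9 and the next entry receives rank 12; a dense ranking would be an equally reasonable choice.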