OpenHands Index
Holistic software-engineering agent benchmark from OpenHands covering issue resolution, multimodal bug fixing, app creation, test generation, and information gathering tasks.
34 rows
Primary metric: openhands_index_score
Sampled: 2026-05-06
Metadata
Metrics
OpenHands Index score, Task scores included, Total cost per instance (lower is better), Average runtime (lower is better), SWE-Bench, SWE-Bench Multimodal, SWT-Bench, Commit0, GAIA
| Rank | Subject | OpenHands Index score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | OpenHands / claude-opus-4-7 | 68.18 | — | Imported | 2026-05-06 |
| 2 | OpenHands Sub-agents / claude-opus-4-6 | 67.80 | — | Imported | 2026-05-06 |
| 3 | OpenHands / claude-opus-4-6 | 66.72 | — | Imported | 2026-05-06 |
| 4 | Gemini CLI / Gemini-3.1-Pro | 65.02 | — | Imported | 2026-05-06 |
| 5 | OpenHands / GPT-5.4 | 64.28 | — | Imported | 2026-05-06 |
| 6 | Codex / GPT-5.4 | 63.64 | — | Imported | 2026-05-06 |
| 7 | Codex / GPT-5.5 | 63.20 | — | Imported | 2026-05-06 |
| 8 | OpenHands / claude-opus-4-5 | 60.58 | — | Imported | 2026-05-06 |
| 9 | OpenHands / GPT-5.2 | 58.84 | — | Imported | 2026-05-06 |
| 10 | Claude Code / claude-opus-4-7 | 58.76 | — | Imported | 2026-05-06 |
| 11 | OpenHands / GPT-5.2-Codex | 58.28 | — | Imported | 2026-05-06 |
| 12 | OpenHands / GLM-5.1 | 58.24 | — | Imported | 2026-05-06 |
| 13 | Claude Code / claude-opus-4-6 | 57.92 | — | Imported | 2026-05-06 |
| 14 | OpenHands / Gemini-3.1-Pro | 56.98 | — | Imported | 2026-05-06 |
| 15 | Gemini CLI / Gemini-3-Flash | 54.76 | — | Imported | 2026-05-06 |
| 16 | Claude Code / claude-sonnet-4-5 | 54.64 | — | Imported | 2026-05-06 |
| 17 | OpenHands / claude-sonnet-4-5 | 53.00 | — | Imported | 2026-05-06 |
| 18 | OpenHands / Qwen3.6-Plus | 52.86 | — | Imported | 2026-05-06 |
| 19 | OpenHands Sub-agents / claude-sonnet-4-5 | 52.56 | — | Imported | 2026-05-06 |
| 20 | OpenHands / GLM-5 | 49.44 | — | Imported | 2026-05-06 |
| 21 | OpenHands / Kimi-K2.5 | 49.18 | — | Imported | 2026-05-06 |
| 22 | OpenHands / Gemini-3-Pro | 49.04 | — | Imported | 2026-05-06 |
| 23 | OpenHands / Gemini-3-Flash | 49.00 | — | Imported | 2026-05-06 |
| 24 | OpenHands / DeepSeek-V3.2-Reasoner | 45.68 | — | Imported | 2026-05-06 |
| 25 | OpenHands / MiniMax-M2.5 | 45.22 | — | Imported | 2026-05-06 |
| 26 | OpenHands / claude-sonnet-4-6 | 44.52 | — | Imported | 2026-05-06 |
| 27 | OpenHands / Minimax-2.7 | 43.38 | — | Imported | 2026-05-06 |
| 28 | OpenHands / GLM-4.7 | 42.26 | — | Imported | 2026-05-06 |
| 29 | OpenHands / MiniMax-M2.1 | 41.16 | — | Imported | 2026-05-06 |
| 30 | OpenHands / Kimi-K2-Thinking | 41.00 | — | Imported | 2026-05-06 |
| 31 | OpenHands / Qwen3.5-Flash | 38.08 | — | Imported | 2026-05-06 |
| 32 | OpenHands / Nemotron-3-Super | 36.16 | — | Imported | 2026-05-06 |
| 33 | OpenHands / Qwen3-Coder-480B | 30.94 | — | Imported | 2026-05-06 |
| 34 | OpenHands / Nemotron-3-Nano | 15.48 | — | Imported | 2026-05-06 |
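
Each Subject entry has the form `harness / model`, so the same model can appear under several agent harnesses. As a minimal sketch of one way to compare harnesses on a shared model, the snippet below hard-codes a handful of rows copied from the table and splits each subject on ` / `. The helper name `scores_by_model` and the choice of rows are illustrative assumptions, not part of any leaderboard tooling.

```python
from collections import defaultdict

# A few (subject, score) pairs copied from the table above; not the full leaderboard.
ROWS = [
    ("OpenHands / claude-opus-4-7", 68.18),
    ("Claude Code / claude-opus-4-7", 58.76),
    ("OpenHands / GPT-5.4", 64.28),
    ("Codex / GPT-5.4", 63.64),
    ("Gemini CLI / Gemini-3.1-Pro", 65.02),
    ("OpenHands / Gemini-3.1-Pro", 56.98),
]


def scores_by_model(rows):
    """Group scores as {model: {harness: score}} (hypothetical helper)."""
    grouped = defaultdict(dict)
    for subject, score in rows:
        # Split only on the first " / " so model names stay intact.
        harness, model = (part.strip() for part in subject.split(" / ", 1))
        grouped[model][harness] = score
    return dict(grouped)


if __name__ == "__main__":
    for model, by_harness in scores_by_model(ROWS).items():
        best = max(by_harness, key=by_harness.get)
        spread = max(by_harness.values()) - min(by_harness.values())
        print(f"{model}: best harness = {best}, spread = {spread:.2f}")
```

The spread is only a difference between index scores for the listed rows; it says nothing about cost or runtime, which the leaderboard reports as separate metrics.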