OpenHands Index

Holistic software-engineering agent benchmark from OpenHands covering issue resolution, multimodal bug fixing, app creation, test generation, and information gathering tasks.

Rows: 34
Primary metric: openhands_index_score
Sampled: 2026-05-06

Metadata

Metrics

OpenHands Index score, Task scores included, Total cost per instance (lower is better), Average runtime (lower is better), SWE-Bench, SWE-Bench Multimodal, SWT-Bench, Commit0, GAIA

Latest Results

Each row is an agent/model submission from the OpenHands Index. The score is the unweighted mean of the task scores a submission reports; because some submissions do not include every task, the mean is taken over the available tasks only.

Rank | Subject | OpenHands Index score | Model Match | Provenance | Sampled
1 | OpenHands / claude-opus-4-7 | 68.18 | | Imported | 2026-05-06
2 | OpenHands Sub-agents / claude-opus-4-6 | 67.80 | | Imported | 2026-05-06
3 | OpenHands / claude-opus-4-6 | 66.72 | | Imported | 2026-05-06
4 | Gemini CLI / Gemini-3.1-Pro | 65.02 | | Imported | 2026-05-06
5 | OpenHands / GPT-5.4 | 64.28 | | Imported | 2026-05-06
6 | Codex / GPT-5.4 | 63.64 | | Imported | 2026-05-06
7 | Codex / GPT-5.5 | 63.20 | | Imported | 2026-05-06
8 | OpenHands / claude-opus-4-5 | 60.58 | | Imported | 2026-05-06
9 | OpenHands / GPT-5.2 | 58.84 | | Imported | 2026-05-06
10 | Claude Code / claude-opus-4-7 | 58.76 | | Imported | 2026-05-06
11 | OpenHands / GPT-5.2-Codex | 58.28 | | Imported | 2026-05-06
12 | OpenHands / GLM-5.1 | 58.24 | | Imported | 2026-05-06
13 | Claude Code / claude-opus-4-6 | 57.92 | | Imported | 2026-05-06
14 | OpenHands / Gemini-3.1-Pro | 56.98 | | Imported | 2026-05-06
15 | Gemini CLI / Gemini-3-Flash | 54.76 | | Imported | 2026-05-06
16 | Claude Code / claude-sonnet-4-5 | 54.64 | | Imported | 2026-05-06
17 | OpenHands / claude-sonnet-4-5 | 53 | | Imported | 2026-05-06
18 | OpenHands / Qwen3.6-Plus | 52.86 | | Imported | 2026-05-06
19 | OpenHands Sub-agents / claude-sonnet-4-5 | 52.56 | | Imported | 2026-05-06
20 | OpenHands / GLM-5 | 49.44 | | Imported | 2026-05-06
21 | OpenHands / Kimi-K2.5 | 49.18 | | Imported | 2026-05-06
22 | OpenHands / Gemini-3-Pro | 49.04 | | Imported | 2026-05-06
23 | OpenHands / Gemini-3-Flash | 49 | | Imported | 2026-05-06
24 | OpenHands / DeepSeek-V3.2-Reasoner | 45.68 | | Imported | 2026-05-06
25 | OpenHands / MiniMax-M2.5 | 45.22 | | Imported | 2026-05-06
26 | OpenHands / claude-sonnet-4-6 | 44.52 | | Imported | 2026-05-06
27 | OpenHands / Minimax-2.7 | 43.38 | | Imported | 2026-05-06
28 | OpenHands / GLM-4.7 | 42.26 | | Imported | 2026-05-06
29 | OpenHands / MiniMax-M2.1 | 41.16 | | Imported | 2026-05-06
30 | OpenHands / Kimi-K2-Thinking | 41 | | Imported | 2026-05-06
31 | OpenHands / Qwen3.5-Flash | 38.08 | | Imported | 2026-05-06
32 | OpenHands / Nemotron-3-Super | 36.16 | | Imported | 2026-05-06
33 | OpenHands / Qwen3-Coder-480B | 30.94 | | Imported | 2026-05-06
34 | OpenHands / Nemotron-3-Nano | 15.48 | | Imported | 2026-05-06
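
The aggregation described under Latest Results (an unweighted mean over whichever task scores a submission provides) can be sketched as follows. This is a minimal illustration, not the index's actual implementation: the function name `index_score`, the dictionary layout, and the example numbers are all hypothetical, and only the five task names are taken from the Metrics list above.

```python
from typing import Mapping, Optional

# Task names as listed under Metrics; treating these five as the
# components of the index score is an assumption of this sketch.
TASKS = ("SWE-Bench", "SWE-Bench Multimodal", "SWT-Bench", "Commit0", "GAIA")


def index_score(task_scores: Mapping[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean over the task scores that are present.

    Missing tasks (absent keys or None values) are skipped rather than
    counted as zero, so the denominator is the number of available tasks.
    """
    available = [task_scores[t] for t in TASKS if task_scores.get(t) is not None]
    if not available:
        return None  # no task reported, so no score can be computed
    return sum(available) / len(available)


# Hypothetical submissions (made-up numbers, not leaderboard data).
full = {
    "SWE-Bench": 70.0,
    "SWE-Bench Multimodal": 40.0,
    "SWT-Bench": 60.0,
    "Commit0": 50.0,
    "GAIA": 80.0,
}
partial = {"SWE-Bench": 70.0, "GAIA": 50.0}  # only two tasks reported

print(index_score(full))     # mean of five task scores
print(index_score(partial))  # mean of the two available task scores
```

One consequence of averaging only over available tasks is that a submission reporting a subset of tasks is not directly comparable to one reporting all five, which is worth keeping in mind when reading the ranking.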