SWE-bench Full
Original SWE-bench leaderboard over 2,294 real GitHub issue resolution tasks.
24 rows
Primary metric: resolved
Sampled: 2025-12-19
| Rank | Subject | Resolved | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Sonar Foundation Agent + Claude 4.5 Opus | 52.62% | — | Imported | 2025-12-19 |
| 2 | Salesforce AI Research SAGE (bash-only) | 44.25% | — | Imported | 2025-12-19 |
| 3 | Atlassian Rovo Dev (2025-06-05) | 41.98% | — | Imported | 2025-12-19 |
| 4 | Amazon Q Developer Agent (v20250405-dev) | 37.10% | — | Imported | 2025-12-19 |
| 5 | SWE-agent 1.0 (Claude 3.7 Sonnet) | 33.83% | — | Imported | 2025-12-19 |
| 6 | Amazon Q Developer Agent (v20241202-dev) | 29.99% | — | Imported | 2025-12-19 |
| 7 | OpenHands + CodeAct v2.1 (claude-3-5-sonnet-20241022) | 29.38% | — | Imported | 2025-12-19 |
| 8 | AutoCodeRover-v2.0 (Claude-3.5-Sonnet-20241022) | 24.89% | — | Imported | 2025-12-19 |
| 9 | Honeycomb | 22.06% | — | Imported | 2025-12-19 |
| 10 | Amazon Q Developer Agent (v20240719-dev) | 19.75% | — | Imported | 2025-12-19 |
| 11 | Factory Code Droid | 19.27% | — | Imported | 2025-12-19 |
| 12 | AutoCodeRover (v20240620) + GPT 4o (2024-05-13) | 18.83% | — | Imported | 2025-12-19 |
| 13 | SWE-agent + Claude 3.5 Sonnet | 18.13% | — | Imported | 2025-12-19 |
| 14 | AppMap Navie + GPT 4o (2024-05-13) | 14.60% | — | Imported | 2025-12-19 |
| 15 | Amazon Q Developer Agent (v20240430-dev) | 13.82% | — | Imported | 2025-12-19 |
| 16 | SWE-agent + GPT 4 (1106) | 12.47% | — | Imported | 2025-12-19 |
| 17 | SWE-agent + GPT 4o (2024-05-13) | 11.99% | — | Imported | 2025-12-19 |
| 18 | SWE-agent + Claude 3 Opus | 10.51% | — | Imported | 2025-12-19 |
| 19 | RAG + Claude 3 Opus | 3.79% | — | Imported | 2025-12-19 |
| 20 | RAG + Claude 2 | 1.96% | — | Imported | 2025-12-19 |
| 21 | RAG + GPT 4 (1106) | 1.31% | — | Imported | 2025-12-19 |
| 22 | RAG + SWE-Llama 13B | 0.70% | — | Imported | 2025-12-19 |
| 23 | RAG + SWE-Llama 7B | 0.70% | — | Imported | 2025-12-19 |
| 24 | RAG + ChatGPT 3.5 | 0.17% | — | Imported | 2025-12-19 |
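
The Resolved column can be read as a share of the 2,294 tasks. The sketch below converts between a percentage and an approximate task count. It assumes each percentage is taken over all 2,294 tasks and rounded to two decimals; the leaderboard does not state this, and the function names are illustrative only.

```python
TOTAL_TASKS = 2294  # task count from the leaderboard description


def resolved_count(percent: float, total: int = TOTAL_TASKS) -> int:
    """Estimate how many tasks a rounded Resolved percentage corresponds to.

    Assumption: the percentage is computed over all `total` tasks.
    """
    return round(percent / 100 * total)


def resolved_percent(count: int, total: int = TOTAL_TASKS) -> float:
    """Convert a resolved-task count back to a percentage with two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    # Rank 1 entry: 52.62% of 2,294 tasks
    print(resolved_count(52.62))    # 1207
    print(resolved_percent(1207))   # 52.62
```

Because the published figures are rounded, the recovered counts are estimates, not reported values.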