SWE-bench Lite

Curated 300-instance SWE-bench subset for lower-cost evaluation of issue-resolving agents.

Rows: 84
Primary metric: resolved
Sampled: 2025-09-11

Metrics

Resolved

Latest Results

Rows from the official SWE-bench Lite leaderboard. Each entry reports the percentage of the 300 curated SWE-bench tasks that the system resolved. Rows are agent systems or scaffolds, not scores for a base model on its own.
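
Because every score is a share of the same 300 tasks, a percentage can be converted back to a task count. The sketch below assumes that the displayed values are whole-task counts divided by 300 and rounded to two decimals; the function names are illustrative and not part of any SWE-bench tooling.

```python
TOTAL_INSTANCES = 300  # size of SWE-bench Lite, as stated in the description


def resolved_count(percent: float, total: int = TOTAL_INSTANCES) -> int:
    """Recover the number of resolved tasks from a rounded percentage.

    Assumption: the percentage is count / total * 100, rounded to 2 decimals.
    """
    return round(percent / 100 * total)


def resolved_percent(count: int, total: int = TOTAL_INSTANCES) -> float:
    """Format a task count the way the leaderboard appears to (2 decimals)."""
    return round(count / total * 100, 2)


if __name__ == "__main__":
    for pct in (60.33, 60.0, 0.33):
        print(f"{pct}% of {TOTAL_INSTANCES} tasks = {resolved_count(pct)} resolved")
```

Under that assumption one task is worth about 0.33 percentage points, so the gap between 60.33% and 60% corresponds to a single resolved task.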

Rank  Subject  Resolved  Model Match (blank in every row)  Provenance  Sampled
1 ExpeRepair-v1.0 + Claude 4 Sonnet 60.33% Imported 2025-09-11
2 Refact.ai Agent 60% Imported 2025-09-11
3 KGCompass + Claude 4 Sonnet (20250514) 58.33% Imported 2025-09-11
4 SWE-agent + Claude 4 Sonnet 56.67% Imported 2025-09-11
5 Isoform 55% Imported 2025-09-11
6 SemAgent_Multi-v1.0 51.67% Imported 2025-09-11
7 Isea 51.33% Imported 2025-09-11
8 EntroPO + R2E + Qwen3-Coder-30B-A3B-Instruct 49.67% Imported 2025-09-11
9 Blackbox AI Agent 49% Imported 2025-09-11
10 Codev 49% Imported 2025-09-11
11 Gru(2024-12-08) 48.67% Imported 2025-09-11
12 ExpeRepair-v1.0 48.33% Imported 2025-09-11
13 Globant Code Fixer Agent 48.33% Imported 2025-09-11
14 SWE-agent + Claude 3.7 Sonnet 48% Imported 2025-09-11
15 devlo 47.33% Imported 2025-09-11
16 DARS Agent 47% Imported 2025-09-11
17 KGCompass + Claude 3.5 Sonnet (20241022) 46% Imported 2025-09-11
18 EntroPO + R2E + Qwen3-Coder-30B-A3B-Instruct 45% Imported 2025-09-11
19 Kodu-v1 + Claude-3.5 Sonnet (20241022) 44.67% Imported 2025-09-11
20 CodeFuse-CGM 44% Imported 2025-09-11
21 CodeStory Aide + Mixed Models 43% Imported 2025-09-11
22 Lingxi 42.67% Imported 2025-09-11
23 OpenHands + CodeAct v2.1 (claude-3-5-sonnet-20241022) 41.67% Imported 2025-09-11
24 Codart AI 41.67% Imported 2025-09-11
25 PatchKitty-0.9 + Claude-3.5 Sonnet (20241022) 41.33% Imported 2025-09-11
26 OrcaLoca + Agentless-1.5 + Claude-3.5 Sonnet (20241022) 41% Imported 2025-09-11
27 Composio SWE-Kit (2024-10-30) 41% Imported 2025-09-11
28 Agentless-1.5 + Claude-3.5 Sonnet (20241022) 40.67% Imported 2025-09-11
29 OpenCSG Starship Agentic Coder + GPT 4 (0806) 39.67% Imported 2025-09-11
30 Bytedance MarsCode Agent 39.33% Imported 2025-09-11
31 Moatless Tools + Claude 3.5 Sonnet (20241022) 39% Imported 2025-09-11
32 Moatless Tools + Claude 3.5 Sonnet (20241022) 38.33% Imported 2025-09-11
33 Honeycomb 38.33% Imported 2025-09-11
34 AbanteAI MentatBot + GPT 4o (2024-05-13) 38% Imported 2025-09-11
35 Patched.Codes Patchwork 37% Imported 2025-09-11
36 KGCompass + DeepSeek V3 36.67% Imported 2025-09-11
37 AppMap Navie v2 36% Imported 2025-09-11
38 CodeFuse-AAIS 35.67% Imported 2025-09-11
39 Gru(2024-08-11) 35.67% Imported 2025-09-11
40 Isoform 35% Imported 2025-09-11
41 SuperCoder2.0 34% Imported 2025-09-11
42 Bytedance MarsCode Agent + GPT 4o (2024-05-13) 34% Imported 2025-09-11
43 Alibaba Lingma Agent 33% Imported 2025-09-11
44 Agentless Lite + O3 Mini (20250214) 32.33% Imported 2025-09-11
45 Agentless-1.5 + GPT 4o (2024-05-13) 32% Imported 2025-09-11
46 Factory Code Droid 31.33% Imported 2025-09-11
47 CodeShellTester + GPT 4o (2024-05-13) 31.33% Imported 2025-09-11
48 Moatless Tools + Deepseek V3 30.67% Imported 2025-09-11
49 AutoCodeRover (v20240620) + GPT 4o (2024-05-13) 30.67% Imported 2025-09-11
50 Aegis - o3-mini_1.0 30.33% Imported 2025-09-11
51 AIGCode Infant-Coder(2024-08-30) 30% Imported 2025-09-11
52 Kortix AI (claude-3-5-sonnet-20241022) 30% Imported 2025-09-11
53 Amazon Q Developer Agent (v20240719-dev) 29.67% Imported 2025-09-11
54 Agentless + RepoGraph + GPT-4o 29.67% Imported 2025-09-11
55 CodeR + GPT 4 (1106) 28.33% Imported 2025-09-11
56 reproducedRG 28% Imported 2025-09-11
57 SIMA + GPT 4o (2024-05-13) 27.67% Imported 2025-09-11
58 MASAI + GPT 4o (2024-05-13) 27.33% Imported 2025-09-11
59 Agentless + GPT 4o (2024-05-13) 27.33% Imported 2025-09-11
60 Moatless Tools + Claude 3.5 Sonnet 26.67% Imported 2025-09-11
61 OpenHands + CodeAct v1.8 26.67% Imported 2025-09-11
62 IBM Research Agent-101 26.67% Imported 2025-09-11
63 Aider + GPT 4o & Claude 3 Opus 26.33% Imported 2025-09-11
64 HyperAgent 25.33% Imported 2025-09-11
65 SWE-Fixer (Qwen2.5-7b retriever + Qwen2.5-72b editor) 24.67% Imported 2025-09-11
66 Moatless Tools + GPT 4o (2024-05-13) 24.67% Imported 2025-09-11
67 IBM AI Agent SWE-1.0 (with open LLMs) 23.67% Imported 2025-09-11
68 OpenCSG StarShip CodeGenAgent + GPT 4 (0613) 23.67% Imported 2025-09-11
69 SWE-Fixer (Qwen2.5-7b retriever + Qwen2.5-72b editor) 20241128 23.33% Imported 2025-09-11
70 SWE-agent + Claude 3.5 Sonnet 23% Imported 2025-09-11
71 AppMap Navie + GPT 4o (2024-05-13) 21.67% Imported 2025-09-11
72 Bytedance AutoSE (based on SWE-Agent) + GPT4/GPT4o Mixed (20240828) 21.67% Imported 2025-09-11
73 Amazon Q Developer Agent (v20240430-dev) 20.33% Imported 2025-09-11
74 AutoCodeRover (v20240408) + GPT 4 (0125) 19% Imported 2025-09-11
75 SWE-agent + GPT 4o (2024-05-13) 18.33% Imported 2025-09-11
76 SWE-agent + GPT 4 (1106) 18% Imported 2025-09-11
77 MCTS-Refine-7B 16.33% Imported 2025-09-11
78 SWE-agent + Claude 3 Opus 11.67% Imported 2025-09-11
79 RAG + Claude 3 Opus 4.33% Imported 2025-09-11
80 RAG + Claude 2 3% Imported 2025-09-11
81 RAG + GPT 4 (1106) 2.67% Imported 2025-09-11
82 RAG + SWE-Llama 7B 1.33% Imported 2025-09-11
83 RAG + SWE-Llama 13B 1% Imported 2025-09-11
84 RAG + ChatGPT 3.5 0.33% Imported 2025-09-11
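
The rows above are flattened, space-separated text, and the Subject column itself contains spaces, so splitting on whitespace is not enough to recover the fields. A minimal sketch of one way to parse a row follows. It assumes that each row ends with a percentage, a one-word provenance value, and an ISO date, and that the Model Match column is empty, as it is in every row shown. The `Row` type and `parse_row` helper are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    rank: int
    subject: str
    resolved: float   # percent resolved out of 300 tasks
    provenance: str
    sampled: str      # ISO date string, e.g. "2025-09-11"


# Anchor on the trailing fields; everything between rank and percentage is the subject.
ROW_RE = re.compile(
    r"^(?P<rank>\d+)\s+(?P<subject>.+)\s+(?P<resolved>\d+(?:\.\d+)?)%\s+"
    r"(?P<provenance>\S+)\s+(?P<sampled>\d{4}-\d{2}-\d{2})$"
)


def parse_row(line: str) -> Optional[Row]:
    """Return a Row for a leaderboard line, or None for headers and other text."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return Row(int(m["rank"]), m["subject"], float(m["resolved"]),
               m["provenance"], m["sampled"])


if __name__ == "__main__":
    print(parse_row("11 Gru(2024-12-08) 48.67% Imported 2025-09-11"))
```

Anchoring the pattern on the trailing percentage and date keeps subjects that contain digits or dates, such as "Gru(2024-12-08)", in one piece.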