SWE-bench Extra
Evaluates software-engineering agents on realistic issue resolution, repository navigation, testing, or maintenance workflows.
Rows: 0
Primary metric: score
Sampled: —
Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|

No matching rows.