RepoBench

RepoBench evaluates repository-level code auto-completion: retrieving relevant cross-file context, predicting the next line of code, and combining both in an end-to-end pipeline.

Rows: 1
Primary metric: score
Sampled: 2026-05-27

Metadata

Metrics

Score, Normalized Score

Latest Results

Rows are imported from the public ZeroEval/LLM-Stats RepoBench benchmark details JSON endpoint. Source verification and self-report metadata are preserved.

Rank  Subject        Score  Model Match  Provenance     Sampled
1     Codestral-22B  0.34   -            Self-reported  2026-05-27