LeanDojo Benchmark
Measures mathematical reasoning, symbolic problem solving, proof construction, and competition-style problem solving.
Rows: 4
Primary metric: novel_premises_pass_at_1
Sampled: 2026-05-27
Metadata
Metrics: Novel-premises Pass@1, Random-split Pass@1
| Rank | Subject | Novel-premises Pass@1 | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | ReProver (ours) | 26.3% | — | Imported | 2026-05-27 |
| 2 | ReProver w/o retrieval | 23.2% | — | Imported | 2026-05-27 |
| 3 | GPT-4 | 7.4% | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 4 | tidy | 5.3% | — | Imported | 2026-05-27 |
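
As a rough illustration of how a Pass@1 percentage like those in the table could be computed, the sketch below assumes each theorem gets exactly one proof attempt and that the result is recorded as a boolean. The function name `pass_at_1` and the sample outcomes are hypothetical; they are not taken from the benchmark's tooling or data.

```python
from typing import Sequence


def pass_at_1(outcomes: Sequence[bool]) -> float:
    """Return the percentage of theorems proved on the single allowed attempt."""
    if not outcomes:
        raise ValueError("outcomes must be non-empty")
    # True counts as 1 and False as 0, so sum() gives the number of proved theorems.
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical outcomes for 8 theorems, one attempt each (not benchmark data).
sample = [True, False, False, True, False, False, False, False]
score = pass_at_1(sample)
print(f"Pass@1: {score:.1f}%")
```

Under this reading, a higher value means more theorems were proved on the first attempt, which is the order the table ranks by.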