PutnamBench

PutnamBench measures mathematical reasoning: symbolic problem solving, proof construction, and competition-style problem solving.

37 rows
Primary metric: total_solved_with_solutions
Sampled: 2026-05-27

Metadata

Metrics

Total solved with solutions, Lean solved with solutions, Isabelle solved with solutions, Coq solved with solutions

Latest Results

Rows are parsed from the PutnamBench results.json file. The primary score is the sum of the Lean, Isabelle, and Coq solved counts in the with-solution setting.
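The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key names (`name`, `lean_solved`, `isabelle_solved`, `coq_solved`) and the sample entries are hypothetical placeholders, not the actual results.json schema or real results, and missing or null counts are assumed to contribute zero.

```python
import json

# Hypothetical schema and placeholder data: the real results.json
# may use different keys and structure.
SAMPLE_JSON = """
[
  {"name": "prover-a", "lean_solved": 5, "isabelle_solved": 2, "coq_solved": 1},
  {"name": "prover-b", "lean_solved": 3, "isabelle_solved": null, "coq_solved": 0},
  {"name": "prover-c", "lean_solved": 10}
]
"""

# Assumed per-system keys for the with-solution setting.
SYSTEM_KEYS = ("lean_solved", "isabelle_solved", "coq_solved")


def total_solved(entry):
    """Sum the per-system solved counts, treating missing or null as 0."""
    return sum(entry.get(key) or 0 for key in SYSTEM_KEYS)


def rank_entries(entries):
    """Return (rank, name, total) tuples sorted by total, highest first."""
    scored = [(e["name"], total_solved(e)) for e in entries]
    # sort() is stable, so tied entries keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(i, name, total) for i, (name, total) in enumerate(scored, start=1)]


ranking = rank_entries(json.loads(SAMPLE_JSON))
for rank, name, total in ranking:
    print(rank, name, f"{total} solved")
```

Because the sort is stable, entries with equal totals keep their input order and still receive consecutive ranks, which is one plausible way to end up with adjacent ranks for tied scores.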

| Rank | Subject | Total solved with solutions | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Aleph Prover (Logical Intelligence) | 668 solved | | Imported | 2026-05-27 |
| 2 | Aleph Prover (Logical Intelligence) | 637 solved | | Imported | 2026-05-27 |
| 3 | Seed-Prover 1.5 (ByteDance) | 581 solved | | Imported | 2026-05-27 |
| 4 | Aleph Prover (Logical Intelligence) | 500 solved | | Imported | 2026-05-27 |
| 5 | Hilbert | 462 solved | | Imported | 2026-05-27 |
| 6 | AxProverBase (Axiomatic AI) | 365 solved | | Imported | 2026-05-27 |
| 7 | Seed-Prover (ByteDance) | 329 solved | | Imported | 2026-05-27 |
| 8 | Ax-Prover (Axiomatic AI) | 91 solved | | Imported | 2026-05-27 |
| 9 | Goedel-Prover-V2 | 86 solved | | Imported | 2026-05-27 |
| 10 | DeepSeek-Prover-V2 | 47 solved | | Imported | 2026-05-27 |
| 11 | GPT-5 (ReAct, 10 turns) | 28 solved | | Imported | 2026-05-27 |
| 12 | DSP+ | 23 solved | | Imported | 2026-05-27 |
| 13 | Bourbaki | 14 solved | | Imported | 2026-05-27 |
| 14 | Kimina-Prover-7B-Distill | 10 solved | | Imported | 2026-05-27 |
| 15 | Self-play Theorem Prover | 8 solved | | Imported | 2026-05-27 |
| 16 | ABEL | 7 solved | | Imported | 2026-05-27 |
| 17 | Goedel-Prover-SFT | 7 solved | | Imported | 2026-05-27 |
| 18 | InternLM2.5-StepProver | 6 solved | | Imported | 2026-05-27 |
| 19 | DSP (GPT-4o) | 4 solved | | Imported | 2026-05-27 |
| 20 | InternLM 7B | 4 solved | | Imported | 2026-05-27 |
| 21 | gemini-2.5-pro-exp-0325 | 3 solved | | Imported | 2026-05-27 |
| 22 | GPT-4o | 3 solved | | Imported | 2026-05-27 |
| 23 | Sledgehammer | 3 solved | | Imported | 2026-05-27 |
| 24 | COPRA (GPT-4o) | 2 solved | | Imported | 2026-05-27 |
| 25 | o4-mini-high | 2 solved | | Imported | 2026-05-27 |
| 26 | Deepseek R1 | 1 solved | | Imported | 2026-05-27 |
| 27 | gemini-2.0-flash-thinking-121 | 1 solved | | Imported | 2026-05-27 |
| 28 | claude-3.7-sonnet | 0 solved | | Imported | 2026-05-27 |
| 29 | CoqHammer | 0 solved | | Imported | 2026-05-27 |
| 30 | DeepSeek-V3-0324 | 0 solved | | Imported | 2026-05-27 |
| 31 | GPT-4o-mini | 0 solved | | Imported | 2026-05-27 |
| 32 | Grok-3-mini | 0 solved | | Imported | 2026-05-27 |
| 33 | o3-mini | 0 solved | | Imported | 2026-05-27 |
| 34 | ReProver w/ retrieval | 0 solved | | Imported | 2026-05-27 |
| 35 | ReProver w/o retrieval | 0 solved | | Imported | 2026-05-27 |
| 36 | Tactician (LSH) | 0 solved | | Imported | 2026-05-27 |
| 37 | TIR Conjecturor | 0 solved | | Imported | 2026-05-27 |