MiniF2F

MiniF2F measures mathematical reasoning, including symbolic problem solving, proof construction, and competition-style problem solving.

Rows: 3
Primary metric: test_pass_at_1
Sampled: 2026-05-27

Metadata

Metrics

miniF2F-test Pass@1, miniF2F-test Pass@8, Average proof length (lower is better)
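The Pass@1 and Pass@8 metrics listed above can be illustrated with a small sketch. It assumes Pass@k means the fraction of problems for which at least one of the first k proof attempts succeeds. The page itself does not define the metric, so treat this definition, the function name `pass_at_k`, and the sample data as hypothetical.

```python
from typing import Dict, List


def pass_at_k(attempts: Dict[str, List[bool]], k: int) -> float:
    """Fraction of problems with at least one success in the first k attempts.

    `attempts` maps a problem name to an ordered list of attempt outcomes
    (True = a proof was found). This is an assumed reading of Pass@k,
    not a definition taken from the leaderboard.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("at least one problem is required")
    solved = sum(1 for outcomes in attempts.values() if any(outcomes[:k]))
    return solved / len(attempts)


# Hypothetical outcomes for four made-up problems (not benchmark data).
results = {
    "problem_a": [True, False],
    "problem_b": [False, False, True],
    "problem_c": [False, False, False],
    "problem_d": [False, True],
}

print(pass_at_k(results, 1))  # 0.25: only problem_a succeeds on the first try
print(pass_at_k(results, 8))  # 0.75: problems a, b, and d succeed within 8 tries
```

Under this reading, Pass@8 can never be lower than Pass@1 for the same set of attempts, since the first attempt is always among the first eight.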

Latest Results

Rows are transcribed from the baseline results in Table 3 of the public miniF2F paper (ICLR 2022). The primary score is miniF2F-test Pass@1.

Rank  Subject         miniF2F-test Pass@1  Model Match  Provenance  Sampled
1     Lean GPT-f      24.6%                             Imported    2026-05-27
2     Lean tidy       18.0%                             Imported    2026-05-27
3     Metamath GPT-f  1.3%                              Imported    2026-05-27