ProofNet
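ProofNet: A benchmark for autoformalization and formal proving of undergraduate-level mathematics. The rows below cover two tasks: statement autoformalization (natural language to formal statement) and statement informalization (formal statement to natural language).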
Rows: 6
Primary metric: Accuracy
Sampled: 2026-05-27
Metadata
Metrics
Accuracy, Typecheck Rate
| Rank | Subject | Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Code-davinci-002 (in-context learning) (statement informalization) | 62.3% | — | Imported | 2026-05-27 |
| 2 | Code-davinci-002 (prompt retrieval) (statement autoformalization) | 16.1% | — | Imported | 2026-05-27 |
| 3 | Code-davinci-002 (in-context learning) (statement autoformalization) | 13.4% | — | Imported | 2026-05-27 |
| 4 | proofGPT-6.7B (in-context learning) (statement informalization) | 6.5% | — | Imported | 2026-05-27 |
| 5 | proofGPT-1.3B (in-context learning) (statement informalization) | 4.3% | — | Imported | 2026-05-27 |
| 6 | proofGPT-1.3B (statement autoformalization) | 3.2% | — | Imported | 2026-05-27 |