FunctionalMATH
A functional variant of the MATH benchmark that tests language models' ability to generalize reasoning patterns across different problem instances, revealing the reasoning gap between static and functional performance.
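As a rough illustration of the idea described above, the sketch below treats a "functional" problem as a generator that produces a fresh instance of the same reasoning pattern for each seed, and measures the gap as the relative drop from static to functional accuracy. The names `make_instance`, `functional_correct`, and `reasoning_gap` are hypothetical, the toy problem is not taken from the benchmark, and the gap formula is an assumption; the benchmark's own definition may differ.

```python
import random


def make_instance(seed):
    """Hypothetical functional problem: same reasoning, new numbers per seed."""
    rng = random.Random(seed)
    a, b = rng.randint(2, 9), rng.randint(10, 99)
    question = f"A rectangle has sides {a} and {b}. What is its area?"
    return question, a * b


def functional_correct(solver, seeds):
    """Count a problem as solved only if every sampled instance is answered correctly."""
    return all(solver(question) == answer
               for question, answer in map(make_instance, seeds))


def reasoning_gap(static_acc, functional_acc):
    """Assumed definition: relative drop from static to functional accuracy."""
    if static_acc <= 0:
        raise ValueError("static accuracy must be positive")
    return (static_acc - functional_acc) / static_acc
```

Under these assumptions, a solver that has memorized one fixed instance fails `functional_correct` as soon as the seed changes, and that failure is what the gap is meant to expose.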
Rows: 2
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Normalized Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Gemini 1.5 Pro | 0.65 | — | Self-reported | 2026-05-06 |
| 2 | Gemini 1.5 Flash | 0.54 | — | Self-reported | 2026-05-06 |