GSM-8K (CoT)
Grade School Math 8K (GSM8K) evaluated with chain-of-thought prompting. The dataset contains 8.5K high-quality, linguistically diverse grade school math word problems that require multi-step reasoning with elementary arithmetic operations.
2 rows
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Normalized Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Llama 3.1 70B Instruct | 0.95 | Llama 3.1 70B Instruct (`meta-llama-llama-3.1-70b-instruct`) | Self-reported | 2026-05-06 |
| 2 | Llama 3.1 8B Instruct | 0.84 | Llama 3.1 8B Instruct (`meta-llama-llama-3.1-8b-instruct`) | Self-reported | 2026-05-06 |
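
The scores in the table are self-reported, and this page does not state how they were computed. As a rough illustration only, the sketch below shows one common way a chain-of-thought completion might be scored against a GSM8K reference: take the last number in the model's output and compare it with the number after the `####` marker that ends each GSM8K reference solution. The function names (`extract_final_number`, `reference_answer`, `accuracy`) are hypothetical, and treating the score as a plain accuracy fraction is an assumption.

```python
import re

# Signed integers or decimals, allowing thousands separators such as "1,200".
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text):
    """Return the last number found in `text` as a float, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def reference_answer(solution):
    """Parse the final answer from a GSM8K reference solution ('... #### <answer>')."""
    return extract_final_number(solution.split("####")[-1])


def accuracy(completions, solutions):
    """Fraction of completions whose final number equals the reference answer.

    Assumption: exact numeric match on the last number in the completion.
    Real evaluation harnesses may normalize or extract answers differently.
    """
    correct = 0
    for completion, solution in zip(completions, solutions):
        predicted = extract_final_number(completion)
        if predicted is not None and predicted == reference_answer(solution):
            correct += 1
    return correct / len(solutions)
```

Calling `accuracy(completions, solutions)` with two parallel lists of strings returns a value between 0 and 1, on the same scale as the Score column. A completion with no number in it counts as incorrect.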