Grok 3
Grok / xAI
29 scores
29 benchmarks
$3 / $15 per 1M tokens (cost in/out)
Metadata
Grok Closed/API
Aliases: grok-3, x-ai-grok-3, x-ai/grok-3
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| APEX-Agents | Agentic | 37 | 7.30 | 2026-05-06 |
| ARC-AGI-1 | Agentic | 134 | 5.50 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 139 | 0 | 2026-05-05 |
| Tau2-Bench Telecom | Agentic | 180 | 48.8% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 198 | 11.4% | 2026-05-11 |
| OpenUGI | Alignment | 10 | 63.47 | 2026-05-06 |
| LiveCodeBench | Coding | 85 | 52.901% | 2026-05-28 |
| SciCode | Coding | 157 | 36.8% | 2026-05-11 |
| K-12EduBench | Education | 16 | 63.91 | 2026-05-27 |
| CorpFin v2 | Finance | 52 | 59.713% | 2026-05-28 |
| FinanceArena | Finance | 7 | 44.6 | 2026-05-27 |
| TaxEval v2 | Finance | 4 | 75.879% | 2026-05-28 |
| MedQA | Healthcare | 63 | 83.85% | 2026-04-16 |
| HUMAINE | Human Preference | 11 | 3.67 | 2026-05-06 |
| AIIQ Composite IQ | Intelligence | 39 | 85 | 2026-05-12 |
| Artificial Analysis Intelligence Index | Intelligence | 191 | 25.17 | 2026-05-11 |
| GPQA Diamond | Intelligence | 58 | 74.242% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 291 | 5.1% | 2026-05-11 |
| MMLU Pro | Intelligence | 63 | 79.949% | 2026-05-28 |
| MMLU-Pro | Intelligence | 117 | 79.9% | 2026-05-11 |
| LegalBench | Legal | 38 | 82.595% | 2026-05-28 |
| AIME | Math | 58 | 58.75% | 2026-04-16 |
| AIME 2025 | Math | 120 | 58% | 2026-05-11 |
| IneqMath | Math | 34 | 3.50 | 2026-05-06 |
| MATH 500 | Math | 27 | 89.8% | 2026-01-09 |
| MGSM | Math | 28 | 91.346% | 2026-01-09 |
| Design Arena | Multimodal | 92 | 1111 | 2026-05-06 |
| GPQA Diamond | Reasoning | 212 | 69.3% | 2026-05-11 |
| CritPt | Science | 241 | 0% | 2026-05-11 |