GPT-4o-mini (2024-07-18)
GPT / OpenAI
30 scores
30 benchmarks
Cost (input / output): $0.15 / $0.60 per 1M tokens
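
As a rough illustration of how the listed per-million-token prices translate into a per-request cost, here is a minimal sketch. The function name `request_cost_usd` and the token counts in the example are hypothetical, chosen only for illustration; the two rates are the ones listed above, and actual billing may differ (for example, through cached-input or batch discounts that this card does not mention).

```python
# Rates taken from the card above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 0.15
OUTPUT_USD_PER_MTOK = 0.60


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, not an official API)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example with assumed token counts: 10,000 input tokens and 2,000 output tokens.
cost = request_cost_usd(10_000, 2_000)
print(f"${cost:.4f}")
```

At these rates, output tokens cost four times as much as input tokens, so long completions dominate the bill even when prompts are large.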
Metadata
GPT Closed/API
Aliases: gpt-4o-mini-2024-07-18, openai-gpt-4o-mini-2024-07-18, openai/gpt-4o-mini-2024-07-18
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ARC-AGI-2 | Agentic | 132 | 0 | 2026-05-05 |
| Clembench Multimodal v1.6.5 | Agentic | 7 | 58.46 | 2026-05-06 |
| RewardBench | Alignment | 71 | 80.07 | 2026-05-06 |
| BigCodeBench | Coding | 20 | 46.10 | 2026-05-06 |
| LiveCodeBench | Coding | 26 | 27.50 | 2026-05-06 |
| LiveCodeBench | Coding | 109 | 26.423% | 2026-05-28 |
| CorpFin v2 | Finance | 92 | 45.455% | 2026-05-28 |
| MortgageTax | Finance | 57 | 54.492% | 2026-05-28 |
| TaxEval v2 | Finance | 99 | 60.548% | 2026-05-28 |
| WeirdML | Generalization | 29 | 11.76 | 2026-05-06 |
| MedQA | Healthcare | 81 | 72.436% | 2026-04-16 |
| GPQA Diamond | Intelligence | 102 | 44.192% | 2026-05-28 |
| MMLU Pro | Intelligence | 103 | 62.735% | 2026-05-28 |
| MMMU Pro | Intelligence | 68 | 56.557% | 2026-05-28 |
| AraGen v3 | Language | 22 | 50.74 | 2026-05-06 |
| HindiGen v1 | Language | 15 | 65.50 | 2026-05-06 |
| PIQA | Language | 1 | 88.70 | 2026-05-06 |
| AIME | Math | 88 | 11.458% | 2026-04-16 |
| MATH 500 | Math | 49 | 72.6% | 2026-01-09 |
| MGSM | Math | 64 | 86.182% | 2026-01-09 |
| OTIS Mock AIME 2024-2025 | Mathematics | 31 | 6.94 | 2026-05-06 |
| BenchBench | Meta | 15 | 0.83 | 2026-05-06 |
| VPCT | Multimodal | 11 | 34 | 2026-05-06 |
| Balrog | Reasoning | 14 | 17.40 | 2026-05-06 |
| DROP | Reasoning | 12 | 0.80 | 2026-05-06 |
| SimpleBench | Reasoning | 27 | 10.70 | 2026-05-06 |
| ZebraLogic | Reasoning | 36 | 20.10 | 2026-05-06 |
| AI-Secure LLM Trustworthy Leaderboard | Safety | 5 | 0.76 | 2026-05-06 |
| X-Risks Leaderboard | Safety | 7 | 15.51 | 2026-05-06 |
| Lech Mazur Writing | Writing | 25 | 6.72 | 2026-05-06 |