GPT-4o (2024-05-13)
GPT / OpenAI
17 scores
17 benchmarks
$5 / $15 per 1M tokens (cost in/out)
Metadata
GPT Closed/API
Aliases: gpt-4o-2024-05-13, openai-gpt-4o-2024-05-13, openai/gpt-4o-2024-05-13
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| Clembench Multimodal v1.6.5 | Agentic | 4 | 69.56 | 2026-05-06 |
| OpenUGI | Alignment | 201 | 45.56 | 2026-05-06 |
| RewardBench | Alignment | 60 | 83.27 | 2026-05-06 |
| BigCodeBench | Coding | 1 | 51.10 | 2026-05-06 |
| SciCode | Coding | 238 | 30.9% | 2026-05-11 |
| MixEval Chat | General Knowledge | 4 | 64.70 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 335 | 14.5 | 2026-05-11 |
| GPQA Diamond | Intelligence | 94 | 50.252% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 476 | 2.8% | 2026-05-11 |
| MMLU-Pro | Intelligence | 192 | 74% | 2026-05-11 |
| HindiGen v1 | Language | 7 | 73.98 | 2026-05-06 |
| BenchBench | Meta | 1 | 0.98 | 2026-05-06 |
| DROP | Reasoning | 7 | 0.83 | 2026-05-06 |
| GPQA Diamond | Reasoning | 327 | 52.6% | 2026-05-11 |
| ZebraLogic | Reasoning | 21 | 28.20 | 2026-05-06 |
| AI-Secure LLM Trustworthy Leaderboard | Safety | 2 | 0.83 | 2026-05-06 |
| SWE-bench Verified | Software Engineering | 1 | 33.2% | 2024-08-13 |