GPT-4.5

GPT / OpenAI

30 scores
30 benchmarks
$75 / $150 per 1M tokens (cost in/out)

Metadata

GPT Closed/API

Benchmark Results

Benchmark                               Category               Rank  Score       Sampled
ARC-AGI-1                               Agentic                128   10.30       2026-05-05
ARC-AGI-2                               Agentic                124   0.80        2026-05-05
TextClass Benchmark                     Classification         7     1767.86     2026-05-06
AIRTBench                               Cybersecurity          2     36.89       2026-05-06
Spider                                  Data                   5     85.30       2026-05-06
Open FinLLM Leaderboard                 Finance                3     43.403043%  2026-05-27
Arena-Hard                              Generalization         12    50.0%       2026-05-27
HELM AIR-Bench                          Generalization         30    0.741482    2026-05-28
HELM Safety                             Generalization         10    0.964672    2026-05-28
MMLU Medical Genetics                   Healthcare             1     92.0%       2026-05-27
MMLU Professional Medicine              Healthcare             1     93.75%      2026-05-27
MultiMedQA                              Healthcare             1     82.405833%  2026-05-27
Multi-IF                                Instruction Following  15    0.71        2026-05-06
Artificial Analysis Intelligence Index  Intelligence           243   19.96       2026-05-11
MathVision                              Intelligence           65    47.30       2026-05-06
SimpleQA                                Intelligence           1     62.5%       2026-05-27
HindiGen v1                             Language               30    15.46       2026-05-06
OpenAI-MRCR: 2 needle 128k              Long Context           6     0.39        2026-05-06
CharXiv-D                               Multimodal             3     0.90        2026-05-06
CharXiv-R                               Multimodal             28    0.55        2026-05-06
MMSI-Bench                              Multimodal             6     40.3%       2026-05-28
Video SimpleQA                          Multimodal             4     54.10       2026-05-06
Visual-Language Understanding           Multimodal             34    42.11       2026-05-06
EnigmaEval                              Reasoning              23    3.18        2026-05-06
Graphwalks BFS <128k                    Reasoning              6     0.72        2026-05-06
Graphwalks parents <128k                Reasoning              4     0.73        2026-05-06
Humanity's Last Exam (Text Only)        Reasoning              39    5.80        2026-05-06
LingOly-TOO                             Reasoning              10    0.25        2026-05-06
ComplexFuncBench                        Tool Use               3     0.63        2026-05-06
COLLIE                                  Writing                4     0.72        2026-05-06