Claude Opus 4

Claude / Anthropic

37 scores
28 benchmarks
$15 / $75 per 1M tokens (cost in/out)
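
As a rough illustration of what the listed pricing means per request, the sketch below turns token counts into an estimated cost. It assumes the two figures are USD per one million input and output tokens respectively, and it ignores any discounts or surcharges (caching, batching, and so on) that the page does not mention. The function name `estimate_cost_usd` is hypothetical and not part of any API.

```python
# Prices as listed on the page, read as USD per 1M tokens (assumption).
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token reply.
    print(f"{estimate_cost_usd(2_000, 500):.4f} USD")
```

With these rates, output tokens cost five times as much as input tokens, so long replies tend to dominate the bill.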

Metadata

Claude Closed/API

Aliases: anthropic-claude-4-opus-20250522, anthropic-claude-opus-4, anthropic/claude-4-opus-20250522, anthropic/claude-opus-4, claude-4-opus-20250522, claude-opus-4

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ARC-AGI-1 | Agentic | 84 | 35.67 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 92 | 30.67 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 99 | 27 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 105 | 22.50 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 53 | 8.61 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 70 | 4.52 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 110 | 1.27 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 138 | 0 | 2026-05-05 |
| LiveCodeBench | Coding | 19 | 56.60 | 2026-05-06 |
| LiveCodeBench | Coding | 22 | 46.90 | 2026-05-06 |
| LiveCodeBench | Coding | 74 | 62.629% | 2026-05-28 |
| GSMA Open Telco Leaderboard | Domain | 12 | 68.25 | 2026-05-06 |
| TutorBench | Education | 21 | 45.46 | 2026-05-06 |
| FinanceArena | Finance | 6 | 44.6 | 2026-05-27 |
| MortgageTax | Finance | 47 | 58.586% | 2026-05-28 |
| TaxEval v2 | Finance | 47 | 71.914% | 2026-05-28 |
| Xent Games | Game | 7 | 58.35 overall | 2026-05-28 |
| MedQA | Healthcare | 27 | 92.867% | 2026-04-16 |
| HUMAINE | Human Preference | 25 | 3.50 | 2026-05-06 |
| AIIQ Composite IQ | Intelligence | 32 | 100 | 2026-05-12 |
| GPQA Diamond | Intelligence | 64 | 71.717% | 2026-05-28 |
| MMLU Pro | Intelligence | 28 | 86.166% | 2026-05-28 |
| MMMU Pro | Intelligence | 40 | 73.31% | 2026-05-28 |
| AraGen v3 | Language | 4 | 80.96 | 2026-05-06 |
| HindiGen v1 | Language | 5 | 74.49 | 2026-05-06 |
| LegalBench | Legal | 33 | 83.071% | 2026-05-28 |
| AIME | Math | 69 | 41.25% | 2026-04-16 |
| IneqMath | Math | 28 | 5.50 | 2026-05-06 |
| MATH 500 | Math | 24 | 90.4% | 2026-01-09 |
| MGSM | Math | 8 | 93.782% | 2026-01-09 |
| Design Arena | Multimodal | 53 | 1219 | 2026-05-06 |
| Visual-Language Understanding | Multimodal | 17 | 46.96 | 2026-05-06 |
| Visual-Language Understanding | Multimodal | 29 | 43.53 | 2026-05-06 |
| ARC-AGI v2 | Reasoning | 13 | 0.09 | 2026-05-06 |
| EnigmaEval | Reasoning | 13 | 5.57 | 2026-05-06 |
| EnigmaEval | Reasoning | 23 | 3.21 | 2026-05-06 |
| Humanity's Last Exam (Text Only) | Reasoning | 26 | 10.80 | 2026-05-06 |