Claude Opus 4.5

Claude / Anthropic

121 scores
88 benchmarks
Cost in/out: $5 / $25 per 1M tokens

Metadata

Claude Closed/API

Aliases: anthropic-claude-4.5-opus-20251124, anthropic-claude-opus-4.5, anthropic/claude-4.5-opus-20251124, anthropic/claude-opus-4.5, claude-4.5-opus-20251124, claude-opus-4.5

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
ALFWorld | Agentic | 1 | 1.0 | 2026-05-27
APEX-Agents | Agentic | 15 | 34.80 | 2026-05-06
Berkeley Function-Calling Leaderboard | Agentic | 1 | 77.47% | 2026-05-27
Berkeley Function-Calling Leaderboard | Agentic | 57 | 33.47% | 2026-05-27
CAR-bench | Agentic | 4 | 0.52 | 2026-05-06
EnterpriseOps-Gym | Agentic | 3 | 37% | 2026-05-05
Gert Labs Rankings | Agentic | 7 | 0.63 | 2026-05-11
LLM-WikiRace | Agentic | 6 | 56 | 2026-05-06
MCP Atlas | Agentic | 6 | 69.80 | 2026-05-06
MCPMark | Agentic | 7 | 0.42 | 2026-05-06
MultiChallenge | Agentic | 10 | 58.97 | 2026-05-06
PinchBench | Agentic | 16 | 0.87 | 2026-05-06
Poker Agent | Agentic | 11 | 1033.379% | 2025-12-23
RuneBench | Agentic | 9 | 4.10 | 2026-05-05
Tau2-Bench Telecom | Agentic | 59 | 89.5% | 2026-05-11
Tau2-Bench Telecom | Agentic | 78 | 86.3% | 2026-05-11
Terminal-Bench Hard | Agentic | 15 | 47% | 2026-05-11
Terminal-Bench Hard | Agentic | 35 | 40.9% | 2026-05-11
Vending-Bench 2 | Agentic | 13 | 4967.06 | 2026-05-28
OpenUGI | Alignment | 294 | 42.64 | 2026-05-06
OpenUGI | Alignment | 600 | 34.23 | 2026-05-06
scBench | Biology | 10 | 47.18% | 2026-05-27
SpatialBench | Biology | 11 | 42.77% | 2026-05-27
Arena AI Code | Coding | 12 | 1467 | 2026-05-06
HoudiniVexBench | Coding | 1 | 0.51 | 2026-05-06
IOI | Coding | 11 | 23.584% | 2026-05-26
IOI | Coding | 15 | 20.25% | 2026-05-26
LiveCodeBench | Coding | 28 | 83.67% | 2026-05-28
LiveCodeBench | Coding | 54 | 75.034% | 2026-05-28
LMArena WebDev Arena | Coding | 12 | 1467.21 | 2026-05-06
SciCode | Coding | 23 | 49.5% | 2026-05-11
SciCode | Coding | 28 | 47% | 2026-05-11
SWE-bench Verified | Coding | 11 | 76.4% | 2026-05-28
Terminal-Bench 2.0 | Coding | 10 | 58.427% | 2026-05-28
Terminal-Bench 2.0 | Coding | 16 | 53.933% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 23 | 20.63% | 2026-05-28
VibeCodingBench | Coding | 1 | 89.15 | 2026-05-06
Arena AI Document | Document AI | 9 | 1470 | 2026-05-06
GSMA Open Telco Leaderboard | Domain | 8 | 69.64 | 2026-05-06
IB-bench | Domain Specific | 1 | 30.50 | 2026-05-06
SAGE | Education | 4 | 52.092% | 2026-05-28
SAGE | Education | 18 | 45.002% | 2026-05-28
TutorBench | Education | 11 | 51.20 | 2026-05-06
TutorBench | Education | 14 | 49.82 | 2026-05-06
From Perception to Action | Embodied AI | 3 | 15.6% | 2026-05-28
Vectara HHEM Hallucination Leaderboard | Factuality | 66 | 89.10 | 2026-05-06
CorpFin v2 | Finance | 19 | 65.074% | 2026-05-28
CorpFin v2 | Finance | 34 | 61.305% | 2026-05-28
Finance Agent v1.1 | Finance | 8 | 58.81% | 2026-05-04
MortgageTax | Finance | 9 | 68.68% | 2026-05-28
MortgageTax | Finance | 17 | 67.686% | 2026-05-28
PRBench Finance | Finance | 8 | 46.16 | 2026-05-06
TaxEval v2 | Finance | 14 | 74.856% | 2026-05-28
TaxEval v2 | Finance | 21 | 74.325% | 2026-05-28
BenchLM | General Knowledge | 23 | 77 | 2026-05-06
WeirdML | Generalization | 3 | 63.70 | 2026-05-06
MedCode | Healthcare | 12 | 49.156% | 2026-05-28
MedCode | Healthcare | 19 | 45.174% | 2026-05-28
MedQA | Healthcare | 10 | 95.875% | 2026-04-16
MedQA | Healthcare | 24 | 93.158% | 2026-04-16
MedScribe | Healthcare | 7 | 85.321% | 2026-05-28
MedScribe | Healthcare | 13 | 83.246% | 2026-05-28
Omi SOAP Note Safety Benchmark | Healthcare | 5 | 4.54 | 2026-04-21
HUMAINE | Human Preference | 44 | 3.25 | 2026-05-06
AIIQ Composite IQ | Intelligence | 10 | 123 | 2026-05-12
Artificial Analysis Intelligence Index | Intelligence | 23 | 49.73 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 53 | 43.09 | 2026-05-11
GPQA Diamond | Intelligence | 22 | 85.859% | 2026-05-28
GPQA Diamond | Intelligence | 43 | 79.546% | 2026-05-28
Humanity's Last Exam | Intelligence | 29 | 28.4% | 2026-05-11
Humanity's Last Exam | Intelligence | 113 | 12.9% | 2026-05-11
MathVision | Intelligence | 20 | 77.10 | 2026-05-06
MMLU Pro | Intelligence | 17 | 87.26% | 2026-05-28
MMLU Pro | Intelligence | 33 | 85.59% | 2026-05-28
MMLU-Pro | Intelligence | 2 | 89.5% | 2026-05-11
MMLU-Pro | Intelligence | 5 | 88.9% | 2026-05-11
MMMU Pro | Intelligence | 19 | 82.948% | 2026-05-28
MMMU Pro | Intelligence | 24 | 81.098% | 2026-05-28
AraGen v3 | Language | 5 | 80.29 | 2026-05-06
CaseLaw v2 | Legal | 18 | 62.594% | 2026-05-04
LegalBench | Legal | 13 | 84.604% | 2026-05-28
LegalBench | Legal | 35 | 82.837% | 2026-05-28
Professional Reasoning Bench - Legal | Legal | 9 | 44.21 | 2026-05-06
Fiction.LiveBench | Long Context | 17 | 37.50 | 2026-05-06
AIME | Math | 12 | 95.417% | 2026-04-16
AIME | Math | 50 | 76.875% | 2026-04-16
AIME 2025 | Math | 20 | 91.3% | 2026-05-11
AIME 2025 | Math | 112 | 62.7% | 2026-05-11
MGSM | Math | 1 | 95.2% | 2026-01-09
MGSM | Math | 2 | 94.764% | 2026-01-09
ProofBench | Math | 8 | 36% | 2026-05-28
FrontierMath 2025-02-28 Private | Mathematics | 6 | 20.69 | 2026-05-06
FrontierMath Tier 4 2025-07-01 Private | Mathematics | 6 | 4.17 | 2026-05-06
OTIS Mock AIME 2024-2025 | Mathematics | 6 | 86.11 | 2026-05-06
Medical Chronology LLM Benchmark | Medical | 3 | 0.91 | 2026-05-06
Design Arena | Multimodal | 14 | 1300 | 2026-05-06
MMMU-Pro | Multimodal | 18 | 73.90 | 2026-05-06
Visual-Language Understanding | Multimodal | 19 | 46.43 | 2026-05-06
Visual-Language Understanding | Multimodal | 24 | 45.32 | 2026-05-06
VPCT | Multimodal | 8 | 40 | 2026-05-06
Artificial Analysis Openness Index | Openness | 195 | 11.11 | 2026-05-11
Artificial Analysis Openness Index | Openness | 196 | 11.11 | 2026-05-11
ARC-AGI v2 | Reasoning | 9 | 0.38 | 2026-05-06
CAIS Text Capabilities Index | Reasoning | 8 | 36.6 | 2026-05-27
EnigmaEval | Reasoning | 6 | 11.91 | 2026-05-06
EnigmaEval | Reasoning | 16 | 4.65 | 2026-05-06
GPQA Diamond | Reasoning | 39 | 86.6% | 2026-05-11
GPQA Diamond | Reasoning | 95 | 81% | 2026-05-11
Humanity's Last Exam (Text Only) | Reasoning | 8 | 26.32 | 2026-05-06
Humanity's Last Exam (Text Only) | Reasoning | 20 | 13.90 | 2026-05-06
MultiNRC | Reasoning | 7 | 48.63 | 2026-05-06
MultiNRC | Reasoning | 12 | 41.23 | 2026-05-06
SimpleBench | Reasoning | 3 | 62 | 2026-05-06
CAIS Risk Index | Safety | 3 | 34.7 | 2026-05-27
CritPt | Science | 36 | 4.6% | 2026-05-11
CritPt | Science | 124 | 0.3% | 2026-05-11
GSO-Bench | Science | 2 | 26.50 | 2026-05-06
SciPredict | Science | 1 | 23.05 | 2026-05-06
IDE-Bench | Software Engineering | 3 | 83.75 | 2026-05-27
CAIS Vision Capabilities Index | Vision | 23 | 44.9 | 2026-05-27
Lech Mazur Writing | Writing | 6 | 8.54 | 2026-05-06