Claude Sonnet 4.5

Claude / Anthropic

111 scores
84 benchmarks
$3 / $15 per 1M tokens (cost in/out)

Metadata

Claude, Closed/API

Aliases: anthropic-claude-4.5-sonnet-20250929, anthropic-claude-sonnet-4.5, anthropic/claude-4.5-sonnet-20250929, anthropic/claude-sonnet-4.5, claude-4.5-sonnet-20250929, claude-sonnet-4.5

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
ARC-AGI-1 | Agentic | 47 | 63.67 | 2026-05-05
ARC-AGI-1 | Agentic | 67 | 48.33 | 2026-05-05
ARC-AGI-1 | Agentic | 69 | 46.50 | 2026-05-05
ARC-AGI-1 | Agentic | 91 | 31 | 2026-05-05
ARC-AGI-1 | Agentic | 102 | 25.50 | 2026-05-05
ARC-AGI-2 | Agentic | 46 | 13.61 | 2026-05-05
ARC-AGI-2 | Agentic | 56 | 6.94 | 2026-05-05
ARC-AGI-2 | Agentic | 57 | 6.94 | 2026-05-05
ARC-AGI-2 | Agentic | 63 | 5.83 | 2026-05-05
ARC-AGI-2 | Agentic | 77 | 3.75 | 2026-05-05
Berkeley Function-Calling Leaderboard | Agentic | 2 | 73.24% | 2026-05-27
Berkeley Function-Calling Leaderboard | Agentic | 89 | 24.9% | 2026-05-27
EnterpriseOps-Gym | Agentic | 7 | 30.5% | 2026-05-05
Gert Labs Rankings | Agentic | 13 | 0.57 | 2026-05-11
LLM-WikiRace | Agentic | 13 | 43.30 | 2026-05-06
MCP Atlas | Agentic | 12 | 59.50 | 2026-05-06
MCPMark | Agentic | 9 | 0.32 | 2026-05-06
MultiChallenge | Agentic | 12 | 55.32 | 2026-05-06
OSWorld | Agentic | 19 | 62.88% | 2026-05-27
OSWorld | Agentic | 25 | 58.08% | 2026-05-27
OSWorld | Agentic | 50 | 42.88% | 2026-05-27
PinchBench | Agentic | 11 | 0.89 | 2026-05-06
Poker Agent | Agentic | 8 | 1055.504% | 2025-12-23
RuneBench | Agentic | 11 | 3.20 | 2026-05-05
UAVBench | Agentic | 28 | 58.40 | 2026-05-06
Vending-Bench 2 | Agentic | 17 | 3838.74 | 2026-05-28
AgentBench FC | Agents | 6 | 58.90 | 2026-05-06
AgentBench FC | Agents | 7 | 58.30 | 2026-05-06
OpenUGI | Alignment | 297 | 42.39 | 2026-05-06
OpenUGI | Alignment | 848 | 29.44 | 2026-05-06
scBench | Biology | 14 | 33.16% | 2026-05-27
SpatialBench | Biology | 12 | 41.51% | 2026-05-27
ABC-Bench | Coding | 1 | 63.2% +/- 1.9 | 2026-05-27
Arena AI Code | Coding | 40 | 1386 | 2026-05-06
ContextBench | Coding | 1 | 53 | 2026-05-06
IOI | Coding | 17 | 18.334% | 2026-05-26
LiveCodeBench | Coding | 56 | 72.996% | 2026-05-28
SWE-bench Verified | Coding | 30 | 70% | 2026-05-28
Terminal-Bench 2.0 | Coding | 29 | 41.573% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 21 | 22.621% | 2026-05-28
VibeCodingBench | Coding | 6 | 88.56 | 2026-05-06
RP-Bench | Creative | 4 | 1541.10 | 2026-05-06
RP-Bench | Creative | 10 | 1497.30 | 2026-05-06
RP-Bench | Creative | 20 | 4.37 | 2026-05-06
SecCodeBench | Cybersecurity | 13 | 56.83% | 2026-05-28
Arena AI Document | Document AI | 12 | 1450 | 2026-05-06
GSMA Open Telco Leaderboard | Domain | 16 | 66.04 | 2026-05-06
IslamicLegalBench | Domain | 3 | 65.63 | 2026-05-06
SAGE | Education | 36 | 36.065% | 2026-05-28
SAGE | Education | 40 | 32.88% | 2026-05-28
TutorBench | Education | 16 | 49 | 2026-05-06
TutorBench | Education | 21 | 45.70 | 2026-05-06
From Perception to Action | Embodied AI | 4 | 13.8% | 2026-05-28
Vectara HHEM Hallucination Leaderboard | Factuality | 75 | 88 | 2026-05-06
CorpFin v2 | Finance | 31 | 61.966% | 2026-05-28
CorpFin v2 | Finance | 43 | 60.8% | 2026-05-28
Finance Agent v1.1 | Finance | 16 | 54.5% | 2026-05-04
FinChain | Finance | 2 | 58.22 ChainEval | 2026-05-28
MortgageTax | Finance | 31 | 63.99% | 2026-05-28
PRBench Finance | Finance | 12 | 43.79 | 2026-05-06
TaxBench | Finance | 15 | 8.03% mean pass^5 | 2026-05-27
TaxEval v2 | Finance | 32 | 73.303% | 2026-05-28
MageBench Season 1 | Game | 20 | 1589 rating / 10 games | 2026-05-28
ALL Bench LLM | General Knowledge | 22 | 30.42 | 2026-05-06
BenchLM | General Knowledge | 37 | 66 | 2026-05-06
MedCode | Healthcare | 20 | 44.134% | 2026-05-28
MedCode | Healthcare | 26 | 40.569% | 2026-05-28
MedQA | Healthcare | 15 | 94.708% | 2026-04-16
MedScribe | Healthcare | 9 | 84.515% | 2026-05-28
MedScribe | Healthcare | 11 | 84.101% | 2026-05-28
PlaceboBench | Healthcare | 3 | 62.3188 | 2026-05-27
HUMAINE | Human Preference | 27 | 3.49 | 2026-05-06
GPQA Diamond | Intelligence | 35 | 81.633% | 2026-05-28
MathVision | Intelligence | 27 | 71.10 | 2026-05-06
MMLU Pro | Intelligence | 14 | 87.357% | 2026-05-28
MMMU Pro | Intelligence | 29 | 79.306% | 2026-05-28
AraGen v3 | Language | 6 | 78.17 | 2026-05-06
Seneca-TRBench | Language | 5 | 88.78 | 2026-05-06
CaseLaw v2 | Legal | 19 | 62.165% | 2026-05-04
LegalBench | Legal | 20 | 84.084% | 2026-05-28
Professional Reasoning Bench - Legal | Legal | 13 | 40.84 | 2026-05-06
ConStory-Bench | Long Context | 4 | CED 0.52 | 2026-05-28
AIME | Math | 30 | 88.19% | 2026-04-16
MGSM | Math | 4 | 94.327% | 2026-01-09
ProofBench | Math | 15 | 19% | 2026-05-28
FrontierMath 2025-02-28 Private | Mathematics | 10 | 13.49 | 2026-05-06
OTIS Mock AIME 2024-2025 | Mathematics | 13 | 77.78 | 2026-05-06
Medmarks | Medical | 6 | 0.49977366098330706 | 2026-05-27
Medmarks | Medical | 4 | 0.6257561642171057 | 2026-05-27
MedSafe-Dx | Medical | 6 | 87.2 | 2026-05-27
ALL Bench Multimodal | Multimodal | 17 | 30.89 | 2026-05-06
Design Arena | Multimodal | 32 | 1242 | 2026-05-06
Design Arena | Multimodal | 33 | 1240 | 2026-05-06
MMMU-Pro | Multimodal | 23 | 68.90 | 2026-05-06
Visual-Language Understanding | Multimodal | 9 | 48.75 | 2026-05-06
Visual-Language Understanding | Multimodal | 26 | 45 | 2026-05-06
VTB | Multimodal | 10 | 6.20 | 2026-05-06
VTB | Multimodal | 11 | 5.60 | 2026-05-06
CAIS Text Capabilities Index | Reasoning | 18 | 25.4 | 2026-05-27
EnigmaEval | Reasoning | 13 | 6.00 | 2026-05-06
EnigmaEval | Reasoning | 20 | 3.38 | 2026-05-06
Humanity's Last Exam (Text Only) | Reasoning | 20 | 14.09 | 2026-05-06
Humanity's Last Exam (Text Only) | Reasoning | 34 | 7.65 | 2026-05-06
MultiNRC | Reasoning | 16 | 35.83 | 2026-05-06
MultiNRC | Reasoning | 21 | 28.15 | 2026-05-06
CAIS Risk Index | Safety | 2 | 34.1 | 2026-05-27
InvisibleBench | Safety | 3 | 0.04 | 2026-05-06
SciPredict | Science | 2 | 22.55 | 2026-05-06
IDE-Bench | Software Engineering | 1 | 87.5 | 2026-05-27
LiveSQLBench | Text to SQL | 12 | 30.46 | 2026-05-06
CAIS Vision Capabilities Index | Vision | 22 | 46.2 | 2026-05-27