Claude Sonnet 4.6

Claude / Anthropic

129 scores
95 benchmarks
$3 / $15 per 1M tokens (cost in/out)

Metadata

Claude Closed/API

Aliases: anthropic-claude-4.6-sonnet-20260217, anthropic-claude-sonnet-4.6, anthropic/claude-4.6-sonnet-20260217, anthropic/claude-sonnet-4.6, claude-4.6-sonnet-20260217, claude-sonnet-4.6

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
| --- | --- | --- | --- | --- |
| ALFWorld | Agentic | 3 | 1.0 | 2026-05-27 |
| APEX-Agents-AA | Agentic | 6 | 28% | 2026-05-11 |
| ARC-AGI-1 | Agentic | 25 | 86.50 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 29 | 86 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 23 | 60.42 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 24 | 58.33 | 2026-05-05 |
| AutoBench | Agentic | 4 | 3.16 | 2026-05-06 |
| Claw-Eval-Live | Agentic | 3 | 61.9 | 2026-05-27 |
| EnterpriseOps-Gym | Agentic | 2 | 40.4% | 2026-05-05 |
| GDPval-AA | Agentic | 1 | 1633 | 2026-05-06 |
| Gert Labs Rankings | Agentic | 9 | 0.61 | 2026-05-11 |
| ITBench-AA | Agentic | 6 | 39.8% | 2026-05-28 |
| MCP Atlas | Agentic | 7 | 69.50 | 2026-05-06 |
| OSWorld | Agentic | 10 | 72.11% | 2026-05-27 |
| PinchBench | Agentic | 13 | 0.88 | 2026-05-06 |
| RealDataAgentBench | Agentic | 3 | 0.86 | 2026-04-28 |
| RuneBench | Agentic | 12 | 3.20 | 2026-05-05 |
| Tau2-Bench Telecom | Agentic | 108 | 79.5% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 111 | 78.9% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 117 | 75.7% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 7 | 53% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 18 | 46.2% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 30 | 42.4% | 2026-05-11 |
| Toolathlon | Agentic | 4 | 41% | 2026-05-28 |
| Vending-Bench 2 | Agentic | 4 | 7204.14 | 2026-05-28 |
| OpenUGI | Alignment | 54 | 53.52 | 2026-05-06 |
| OpenUGI | Alignment | 61 | 52.82 | 2026-05-06 |
| OpenUGI | Alignment | 74 | 51.87 | 2026-05-06 |
| OpenUGI | Alignment | 252 | 44.01 | 2026-05-06 |
| BioPipelineBench Verified | Biology | 4 | 73.5% | 2026-05-28 |
| ProteinGym Hard | Biology | 4 | 35.4% | 2026-05-28 |
| Protocol Troubleshooting (Anthropic Internal) | Biology | 4 | 42.4% | 2026-05-28 |
| scBench | Biology | 4 | 50.4% | 2026-05-28 |
| scBench | Biology | 9 | 50.26% | 2026-05-27 |
| SpatialBench | Biology | 4 | 48.7% | 2026-05-28 |
| SpatialBench | Biology | 10 | 44.23% | 2026-05-27 |
| Structural Biology Open-Ended | Biology | 4 | 31.3% | 2026-05-28 |
| Organic Chemistry (Anthropic Internal) | Chemistry | 4 | 53.1% | 2026-05-28 |
| Arena AI Code | Coding | 6 | 1526 | 2026-05-06 |
| DeepSWE | Coding | 4 | 31.56 | 2026-05-26 |
| LiveCodeBench | Coding | 35 | 82.091% | 2026-05-28 |
| LMArena WebDev Arena | Coding | 6 | 1526.17 | 2026-05-06 |
| SciCode | Coding | 30 | 46.9% | 2026-05-11 |
| SciCode | Coding | 33 | 46.8% | 2026-05-11 |
| SciCode | Coding | 48 | 44.1% | 2026-05-11 |
| SWE-bench Verified | Coding | 9 | 77.4% | 2026-05-28 |
| Terminal-Bench 2.0 | Coding | 7 | 59.551% | 2026-05-28 |
| Vibe Code Bench v1.1 | Coding | 9 | 51.476% | 2026-05-28 |
| CyberGym | Cybersecurity | 4 | 65.2% | 2026-05-28 |
| ExploitBench v8-bench | Cybersecurity | 7 | 3.37 points | 2026-05-28 |
| ExploitBench v8-bench | Cybersecurity | 8 | 3.17 points | 2026-05-28 |
| ExploitBench v8-bench | Cybersecurity | 10 | 3.37 points | 2026-05-15 |
| ExploitBench v8-bench | Cybersecurity | 11 | 3.17 points | 2026-05-15 |
| Firefox 147 JS Exploitation | Cybersecurity | 4 | 0% | 2026-05-28 |
| OrgForge-IT | Cybersecurity | 4 | 0.800 | 2026-05-28 |
| Arena AI Document | Document AI | 5 | 1500 | 2026-05-06 |
| GSMA Open Telco Leaderboard | Domain | 58 | 44.78 | 2026-05-06 |
| SAGE | Education | 16 | 46.582% | 2026-05-28 |
| AA-Omniscience | Factuality | 5 | 12.37 | 2026-05-11 |
| Vectara HHEM Hallucination Leaderboard | Factuality | 61 | 89.40 | 2026-05-06 |
| CorpFin v2 | Finance | 16 | 65.307% | 2026-05-28 |
| Finance Agent v1.1 | Finance | 2 | 63.331% | 2026-05-04 |
| Finance Agent v2 | Finance | 5 | 51.035% | 2026-05-28 |
| MortgageTax | Finance | 16 | 67.726% | 2026-05-28 |
| Rogo Big Finance Bench | Finance | 3 | 59% rubric / 38% final | 2026-05-28 |
| TaxBench | Finance | 12 | 11.20% mean pass^5 | 2026-05-27 |
| TaxEval v2 | Finance | 2 | 77.106% | 2026-05-28 |
| React Native Evals | Frontend Development | 8 | 80.6227% overall | 2026-05-28 |
| InfiniteBM Chess | Game | 3 | 1190.33 Elo / 11 games | 2026-05-28 |
| InfiniteBM Coup | Game | 2 | 1549.3 Elo / 34 games | 2026-05-28 |
| InfiniteBM Coup | Game | 8 | 519.02 Elo / 6 games | 2026-05-28 |
| InfiniteBM Heads-Up No-Limit Hold'em | Game | 3 | 1485.1 Elo / 20 games | 2026-05-28 |
| InfiniteBM Heads-Up No-Limit Hold'em | Game | 13 | 1251.34 Elo / 209 games | 2026-05-28 |
| InfiniteBM Liar's Dice | Game | 14 | 1267.56 Elo / 6613 games | 2026-05-28 |
| InfiniteBM Liar's Dice | Game | 23 | 1170.63 Elo / 41 games | 2026-05-28 |
| InfiniteBM Settlers of Catan | Game | 2 | 1805.89 Elo / 24 games | 2026-05-28 |
| InfiniteBM Werewolf | Game | 6 | 1137.69 Elo / 22 games | 2026-05-28 |
| InfiniteBM Werewolf | Game | 11 | 889.31 Elo / 19 games | 2026-05-28 |
| ALL Bench LLM | General Knowledge | 20 | 32.28 | 2026-05-06 |
| BenchLM | General Knowledge | 15 | 83 | 2026-05-06 |
| HealthBench Professional | Healthcare | 3 | 41.7% | 2026-05-28 |
| MedQA | Healthcare | 37 | 92.058% | 2026-04-16 |
| PhysicianBench | Healthcare | 5 | 23.0 +/- 2.6 | 2026-05-27 |
| Artificial Analysis Intelligence Index | Intelligence | 15 | 51.72 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 46 | 44.38 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 57 | 42.6 | 2026-05-11 |
| GPQA Diamond | Intelligence | 23 | 85.606% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 24 | 30% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 109 | 13.2% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 140 | 10.8% | 2026-05-11 |
| MMLU Pro | Intelligence | 15 | 87.341% | 2026-05-28 |
| MMMU Pro | Intelligence | 15 | 83.584% | 2026-05-28 |
| Vals Index | Intelligence | 5 | 60.296% | 2026-05-28 |
| Vals Multimodal Index | Intelligence | 5 | 60.783% | 2026-05-28 |
| CaseLaw v2 | Legal | 14 | 63.987% | 2026-05-04 |
| Harvey Legal Agent Benchmark | Legal | 2 | 5.4% | 2026-05-28 |
| LegalBench | Legal | 43 | 82.12% | 2026-05-28 |
| AIME | Math | 20 | 92.292% | 2026-04-16 |
| ProofBench | Math | 7 | 45% | 2026-05-28 |
| Global MMLU | Multilingual | 5 | 86.1% | 2026-05-28 |
| ALL Bench Multimodal | Multimodal | 16 | 32.53 | 2026-05-06 |
| ALL Bench Multimodal | Multimodal | 7 | 17.93 | 2026-05-06 |
| Blueprint-Bench 2 | Multimodal | 8 | 0.570 +/- 0.011 | 2026-05-28 |
| Design Arena | Multimodal | 8 | 1331 | 2026-05-06 |
| IDP Leaderboard | Multimodal | 8 | 80.68 | 2026-05-06 |
| LMArena Vision Arena | Multimodal | 12 | 1277.89 | 2026-05-06 |
| ARC-AGI v2 | Reasoning | 5 | 0.58 | 2026-05-06 |
| CAIS Text Capabilities Index | Reasoning | 11 | 32.6 | 2026-05-27 |
| Context Arena | Reasoning | 8 | 70.50 | 2026-05-06 |
| Context Arena | Reasoning | 9 | 70.38 | 2026-05-06 |
| Context Arena | Reasoning | 10 | 69.61 | 2026-05-06 |
| Context Arena | Reasoning | 28 | 46.73 | 2026-05-06 |
| GPQA Diamond | Reasoning | 29 | 87.5% | 2026-05-11 |
| GPQA Diamond | Reasoning | 102 | 79.9% | 2026-05-11 |
| GPQA Diamond | Reasoning | 103 | 79.7% | 2026-05-11 |
| CAIS Risk Index | Safety | 5 | 38.8 | 2026-05-27 |
| HarmActionsEval | Safety | 3 | 2.84 | 2026-05-06 |
| LiveSecBench | Safety | 2 | 85.97 | 2026-05-27 |
| BioMysteryBench Human-Difficult | Science | 4 | 19.1% | 2026-05-28 |
| BioMysteryBench Human-Difficult | Science | 4 | 19.1% | 2026-04-29 |
| BioMysteryBench Human-Solvable | Science | 4 | 71.8% | 2026-05-28 |
| BioMysteryBench Human-Solvable | Science | 4 | 71.8% | 2026-04-29 |
| CritPt | Science | 44 | 3.1% | 2026-05-11 |
| CritPt | Science | 91 | 0.9% | 2026-05-11 |
| CritPt | Science | 92 | 0.9% | 2026-05-11 |
| ProgramBench | Software Engineering | 3 | 0% | 2026-05-05 |
| SWE-PRBench | Software Engineering | 2 | 0.152 | 2026-05-27 |
| Structured Output Benchmark | Structured Output | 11 | 85.40 | 2026-05-06 |
| CAIS Vision Capabilities Index | Vision | 21 | 47.7 | 2026-05-27 |