Claude Opus 4.6

Claude / Anthropic

203 scores
145 benchmarks
$5 / $25 per 1M tokens (cost in/out)

Metadata

Claude Closed/API

Aliases: anthropic-claude-4.6-opus-20260205, anthropic-claude-opus-4.6, anthropic/claude-4.6-opus-20260205, anthropic/claude-opus-4.6, claude-4.6-opus-20260205, claude-opus-4.6, Opus-4.6 Max, Opus 4.6 Max, Claude Opus 4.6 Max, Claude Opus 4.6 (Max), anthropic/claude-opus-4.6-max

Benchmark Results

Benchmark Category Rank Score Sampled
ALFWorld Agentic 2 1.0 2026-05-27
APEX-Agents Agentic 4 48.40 2026-05-06
APEX-Agents Agentic 8 45.60 2026-05-06
APEX-Agents-AA Agentic 3 33% 2026-05-11
ARC-AGI-1 Agentic 11 94 2026-05-05
ARC-AGI-1 Agentic 14 93 2026-05-05
ARC-AGI-1 Agentic 18 92 2026-05-05
ARC-AGI-1 Agentic 28 86 2026-05-05
ARC-AGI-2 Agentic 14 69.17 2026-05-05
ARC-AGI-2 Agentic 15 68.75 2026-05-05
ARC-AGI-2 Agentic 19 66.25 2026-05-05
ARC-AGI-2 Agentic 21 64.58 2026-05-05
AutoBench Agentic 2 3.24 2026-05-06
AutoLab Agentic 1 0.85 2026-05-06
BrowseComp Agentic 4 83.7% 2026-04-16
CAR-bench Agentic 1 0.58 2026-05-06
Claw-Eval-Live Agentic 1 66.7 2026-05-27
CoWorkBench Agentic 1 68.2% 2026-05-28
EnterpriseOps-Gym Agentic 1 44.6% 2026-05-05
GDPval-AA Agentic 2 1606 2026-05-06
Gert Labs Rankings Agentic 6 0.63 2026-05-11
HiL-Bench Agentic 3 24.33% 2026-05-05
LLM-WikiRace Agentic 4 56.70 2026-05-06
MCP Atlas Agentic 2 75.8% 2026-05-28
MCP Atlas Agentic 2 76.80 2026-05-06
MCP Atlas Agentic 2 75.8% 2026-04-16
MCPMark Agentic 4 56.7% 2026-05-28
MultiChallenge Agentic 12 56.02 2026-05-06
MultiChallenge Agentic 28 37.15 2026-05-06
OSWorld-Verified Agentic 4 72.7% 2026-04-16
PinchBench Agentic 1 0.93 2026-05-06
QwenClawBench Agentic 1 65.5% 2026-05-28
QwenWorldBench Agentic 2 56.1% 2026-05-28
RealDataAgentBench Agentic 5 0.85 2026-04-28
RuneBench Agentic 6 4.40 2026-05-05
Tau2-Bench Telecom Agentic 45 92.1% 2026-05-11
Tau2-Bench Telecom Agentic 86 84.8% 2026-05-11
Terminal-Bench Hard Agentic 14 48.5% 2026-05-11
Terminal-Bench Hard Agentic 17 46.2% 2026-05-11
TERMS-Bench Agentic 1 69.4% SE+ 2026-05-28
Toolathlon Agentic 3 56.8% 2026-05-28
Vending-Bench 2 Agentic 2 8017.59 2026-05-28
WildClawBench Agentic 1 51.60 2026-05-06
OpenUGI Alignment 13 60.41 2026-05-06
OpenUGI Alignment 48 54.23 2026-05-06
OpenUGI Alignment 69 52.20 2026-05-06
OpenUGI Alignment 79 51.57 2026-05-06
scBench Biology 7 52.65% 2026-05-27
SpatialBench Biology 4 52.83% 2026-05-27
Arena AI Code Coding 3 1548 2026-05-06
Arena AI Code Coding 4 1543 2026-05-06
BLXBench Coding 10 71.10 2026-05-06
Claw-Eval Coding 1 70.4% 2026-05-28
DeepSWE Coding 6 27.06 2026-05-26
FrontierSWE Coding 3 4.9 avg rank 2026-05-28
Kernel Bench L3 Coding 1 2.63/98% 2026-05-28
LiveCodeBench Coding 4 88.8% 2026-05-28
LiveCodeBench Coding 21 84.676% 2026-05-28
LMArena WebDev Arena Coding 3 1548.84 2026-05-06
LMArena WebDev Arena Coding 4 1544.36 2026-05-06
NL2Repo Coding 1 47.6% 2026-05-28
QwenSVG Coding 3 1541 2026-05-28
QwenWebDev Coding 1 1617 2026-05-28
SciCode Coding 3 51.9% 2026-05-28
SciCode Coding 12 51.9% 2026-05-11
SciCode Coding 38 45.7% 2026-05-11
SWE-bench Verified Coding 6 78.2% 2026-05-28
Terminal-Bench 2.0 Coding 11 58.427% 2026-05-28
Terminal-Bench 2.0 Coding 4 65.4% 2026-05-28
Terminal-Bench 2.0 Coding 5 65.4% 2026-04-16
Vibe Code Bench v1.1 Coding 6 57.573% 2026-05-28
Vibe Code Bench v1.1 Coding 8 53.498% 2026-05-28
RP-Bench Creative 1 1705.70 2026-05-06
CyberGym Cybersecurity 3 0.74 2026-05-06
CyberGym Cybersecurity 2 73.8% 2026-04-16
OrgForge-IT Cybersecurity 1 1.000 2026-05-28
SecCodeBench Cybersecurity 2 64.9% 2026-05-28
Arena AI Document Document AI 1 1526 2026-05-06
Arena AI Document Document AI 2 1520 2026-05-06
GSMA Open Telco Leaderboard Domain 5 73.30 2026-05-06
SAGE Education 6 51.575% 2026-05-28
TutorBench Education 2 53.68 2026-05-06
TutorBench Education 2 53.55 2026-05-06
Vectara HHEM Hallucination Leaderboard Factuality 78 87.80 2026-05-06
CorpFin v2 Finance 5 67.016% 2026-05-28
Finance Agent v1.1 Finance 5 60.046% 2026-05-04
Finance Agent v1.1 Finance 3 60.1% 2026-04-16
MortgageTax Finance 10 68.522% 2026-05-28
PRBench Finance Finance 1 53.28 2026-05-06
TaxBench Finance 4 21.37% mean pass^5 2026-05-27
TaxEval v2 Finance 3 75.961% 2026-05-28
React Native Evals Frontend Development 6 84.1026% overall 2026-05-28
MageBench Season 1 Game 1 1747 rating / 16 games 2026-05-28
ALL Bench LLM General Knowledge 1 64.87 2026-05-06
BenchLM General Knowledge 10 87 2026-05-06
MAXIFE General Knowledge 6 81.3% 2026-05-28
MMLU-ProX General Knowledge 2 86.1% 2026-05-28
MMLU-Redux General Knowledge 2 95.2% 2026-05-28
NOVA-63 General Knowledge 1 59.1% 2026-05-28
LMArena Text Arena Generalization 1 1500.24 2026-05-06
LMArena Text Arena Generalization 2 1496.47 2026-05-06
MedCode Healthcare 13 49.129% 2026-05-28
MedCode Healthcare 15 48.244% 2026-05-28
MedQA Healthcare 12 95.408% 2026-04-16
MedScribe Healthcare 3 86.738% 2026-05-28
MedScribe Healthcare 4 86.13% 2026-05-28
PhysicianBench Healthcare 2 31.7 +/- 2.3 2026-05-27
PlaceboBench Healthcare 7 36.2319 2026-05-27
HUMAINE Human Preference 28 3.48 2026-05-06
IFBench Instruction Following 6 62.5% 2026-05-28
IFEval Instruction Following 5 91.9% 2026-05-28
AIIQ Composite IQ Intelligence 5 131 2026-05-12
Artificial Analysis Intelligence Index Intelligence 11 52.95 2026-05-11
Artificial Analysis Intelligence Index Intelligence 38 46.46 2026-05-11
GPQA Diamond Intelligence 11 89.646% 2026-05-28
HLE w/ tools Intelligence 3 53% 2026-05-28
Humanity's Last Exam Intelligence 2 40% 2026-05-28
Humanity's Last Exam Intelligence 10 36.7% 2026-05-11
Humanity's Last Exam Intelligence 77 18.6% 2026-05-11
Humanity's Last Exam Intelligence 4 53.3% 2026-04-16
MathVision Intelligence 13 84.60 2026-05-06
MathVision Intelligence 26 71.20 2026-05-06
MMLU Pro Intelligence 7 89.107% 2026-05-28
MMLU-Pro Intelligence 1 89.7% 2026-05-28
MMMU Pro Intelligence 14 83.873% 2026-05-28
OCRBench v2 Intelligence 16 48.40 2026-05-06
OCRBench v2 Intelligence 9 59.80 2026-05-06
SuperGPQA Intelligence 2 72.5% 2026-05-28
CaseLaw v2 Legal 20 62.058% 2026-05-04
Harvey Legal Agent Benchmark Legal 3 4.2% 2026-05-28
LegalBench Legal 8 85.301% 2026-05-28
Professional Reasoning Bench - Legal Legal 1 52.27 2026-05-06
Graphwalks BFS >128k Long Context 2 0.61 2026-05-06
Graphwalks BFS 1M F1 Long Context 4 16.3% 2026-05-28
Graphwalks BFS 1M F1 Long Context 2 41.2% 2026-04-23
Graphwalks BFS 256k F1 Long Context 4 61.1% 2026-05-28
Graphwalks parents >128k Long Context 1 0.95 2026-05-06
Graphwalks Parents 1M F1 Long Context 4 48.6% 2026-05-28
Graphwalks Parents 1M F1 Long Context 1 72% 2026-04-23
Graphwalks Parents 256k F1 Long Context 2 95.4% 2026-05-28
MRCR v2 (8-needle) Long Context 1 0.93 2026-05-06
MRCR-v2 128k Long Context 3 84% 2026-05-28
AIME Math 8 95.625% 2026-04-16
ProofBench Math 5 50% 2026-05-28
HMMT February 2026 Mathematics 2 96.2% 2026-05-28
IMO-AnswerBench Mathematics 6 75.3% 2026-05-28
MathArena Apex Mathematics 3 34.5% 2026-05-28
Medical Chronology LLM Benchmark Medical 1 0.92 2026-05-06
INCLUDE Multilingual 1 87.4% 2026-05-28
MMMLU Multilingual 1 90.6% 2026-05-28
MMMLU Multilingual 3 91.1% 2026-04-16
ALL Bench Multimodal Multimodal 2 63.16 2026-05-06
ALL Bench Multimodal Multimodal 7 8.51 2026-05-06
ALL Bench Multimodal Multimodal 1 52.61 2026-05-06
CharXiv-R Multimodal 16 0.77 2026-05-06
CharXiv-R Multimodal 3 84.7% 2026-04-16
Design Arena Multimodal 2 1345 2026-05-06
Design Arena Multimodal 3 1343 2026-05-06
FigQA Multimodal 2 0.78 2026-05-06
IDP Leaderboard Multimodal 9 80.37 2026-05-06
LMArena Vision Arena Multimodal 1 1317.74 2026-05-06
LMArena Vision Arena Multimodal 4 1311.60 2026-05-06
Visual-Language Understanding Multimodal 20 46.07 2026-05-06
Visual-Language Understanding Multimodal 21 45.48 2026-05-06
VTB Multimodal 2 27.52 2026-05-06
ARC-AGI v2 Reasoning 4 0.69 2026-05-06
CAIS Text Capabilities Index Reasoning 6 44.0 2026-05-27
Context Arena Reasoning 5 73.06 2026-05-06
Context Arena Reasoning 6 72.43 2026-05-06
Context Arena Reasoning 7 72.26 2026-05-06
Context Arena Reasoning 26 48.19 2026-05-06
EnigmaEval Reasoning 10 7.60 2026-05-06
EnigmaEval Reasoning 12 6.84 2026-05-06
FINAL Bench Metacognitive Reasoning 5 76.17 2026-05-06
Global PIQA Reasoning 2 91.2% 2026-05-28
GPQA Diamond Reasoning 2 91.3% 2026-05-28
GPQA Diamond Reasoning 17 89.6% 2026-05-11
GPQA Diamond Reasoning 66 84% 2026-05-11
GPQA Diamond Reasoning 5 91.3% 2026-04-16
Humanity's Last Exam (Text Only) Reasoning 4 36.24 2026-05-06
Humanity's Last Exam (Text Only) Reasoning 12 19.37 2026-05-06
MultiNRC Reasoning 3 57.06 2026-05-06
MultiNRC Reasoning 7 48.34 2026-05-06
CAIS Risk Index Safety 7 40.7 2026-05-27
BioMysteryBench Human-Difficult Science 3 23.5% 2026-04-29
BioMysteryBench Human-Solvable Science 3 77.4% 2026-04-29
CritPt Science 2 12.6% 2026-05-28
CritPt Science 11 12.6% 2026-05-11
CritPt Science 53 2.8% 2026-05-11
DeepSearchQA Search 4 88.7% 2026-05-28
DeepSearchQA Search 1 0.91 2026-05-06
ProgramBench Software Engineering 2 0% 2026-05-05
SWE-bench Multilingual Software Engineering 2 77.5% 2026-05-28
SWE-bench Pro Software Engineering 5 57.3% 2026-05-28
SWE-bench Pro Software Engineering 5 53.4% 2026-04-16
SWE-bench Verified Software Engineering 1 80.8% 2026-05-28
SWE-bench Verified Software Engineering 3 80.8% 2026-04-16
SpreadsheetBench Spreadsheets 1 89.3% 2026-05-28
Structured Output Benchmark Structured Output 12 85.30 2026-05-06
LiveSQLBench Text to SQL 2 39.43 2026-05-06
BFCL-V4 Tool Use 1 76.7% 2026-05-28
WMT24++ Translation 3 82.7% 2026-05-28
CAIS Vision Capabilities Index Vision 18 48.0 2026-05-27