Claude Opus 4.8

Claude / Anthropic

83 scores
77 benchmarks
$5 / $25 per 1M tokens (cost in/out)
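
As a rough illustration of how the per-token pricing translates into a request cost, the sketch below reads the line above as $5 per 1M input tokens and $25 per 1M output tokens. The function name `estimate_cost_usd` and the token counts are hypothetical, and the calculation ignores any caching, batching, or tiered discounts that may apply in practice.

```python
# Assumed rates, taken from the "$5 / $25 per 1M tokens" line (input / output).
INPUT_USD_PER_MILLION = 5
OUTPUT_USD_PER_MILLION = 25


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts.

    Hypothetical helper: a flat per-token estimate with no discounts applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Example: a 200k-token prompt with a 50k-token response.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long responses dominate the bill even when prompts are large.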

Metadata

Claude Closed/API

Aliases: anthropic-claude-opus-4.8, anthropic/claude-opus-4.8, claude-opus-4.8, Opus 4.8, Claude Opus 4.8

Official Sources

2 linked sources

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
AutomationBench | Agentic | 1 | 15.5% | 2026-05-28
BrowseComp | Agentic | 3 | 84.3% | 2026-05-28
DRACO | Agentic | 1 | 80.4% | 2026-05-28
GDPval-AA | Agentic | 1 | 1890 Elo | 2026-05-28
MCP Atlas | Agentic | 1 | 82.2% | 2026-05-28
OSWorld-Verified | Agentic | 1 | 83.4% | 2026-05-28
ScreenSpot-Pro | Agentic | 1 | 87.9% | 2026-05-28
Toolathlon | Agentic | 1 | 59.9% | 2026-05-28
Vending-Bench 2 | Agentic | 8 | 5787.43 | 2026-05-28
Vending-Bench 2 | Agentic | 22 | 2992.34 | 2026-05-28
Vending-Bench 2 | Agentic | 3 | 5787.4 USD | 2026-05-28
Vending-Bench 2 | Agentic | 4 | 2992.3 USD | 2026-05-28
AAV Capsid Packaging Prediction | Biology | 2 | 0.8 | 2026-05-28
BioPipelineBench Verified | Biology | 2 | 87.7% | 2026-05-28
Black-box RNA Sequence Design | Biology | 2 | 10.1 | 2026-05-28
DNA Synthesis Screening Evasion | Biology | 2 | 0.7 | 2026-05-28
LABBench2 Clinical Trials | Biology | 2 | 85.3% | 2026-05-28
LABBench2 Patent Questions | Biology | 1 | 68.8% | 2026-05-28
LABBench2 Reading Tables | Biology | 1 | 77.2% | 2026-05-28
LABBench2 Supplementary Materials | Biology | 1 | 58.9% | 2026-05-28
Long-form Virology Task 1 | Biology | 2 | 0.8 | 2026-05-28
Long-form Virology Task 2 | Biology | 2 | 0.9 | 2026-05-28
ProteinGym Hard | Biology | 2 | 39.6% | 2026-05-28
Protocol Troubleshooting (Anthropic Internal) | Biology | 2 | 59.6% | 2026-05-28
scBench | Biology | 2 | 58.2% | 2026-05-28
SpatialBench | Biology | 2 | 53.3% | 2026-05-28
Structural Biology Open-Ended | Biology | 2 | 79% | 2026-05-28
VCT Virology Capabilities Test | Biology | 2 | 0.5 | 2026-05-28
Organic Chemistry (Anthropic Internal) | Chemistry | 2 | 86.2% | 2026-05-28
FrontierSWE | Coding | 1 | 2.7 avg rank | 2026-05-28
LiveCodeBench | Coding | 3 | 87.819% | 2026-05-28
SWE-bench Verified | Coding | 1 | 88.6% | 2026-05-28
Terminal-Bench 2.0 | Coding | 2 | 70.037% | 2026-05-28
Terminal-Bench 2.1 | Coding | 3 | 71.91% | 2026-05-28
Terminal-Bench 2.1 | Coding | 2 | 74.6% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 1 | 82.725% | 2026-05-28
CyberGym | Cybersecurity | 2 | 78.8% | 2026-05-28
ExploitBench v8-bench | Cybersecurity | 3 | 5.45 points | 2026-05-28
ExploitBench v8-bench | Cybersecurity | 4 | 5.02 points | 2026-05-28
Firefox 147 JS Exploitation | Cybersecurity | 2 | 8.8% | 2026-05-28
OfficeQA (Anthropic Harness) | Document AI | 1 | 77.6% | 2026-05-28
OfficeQA Pro (Anthropic Harness) | Document AI | 1 | 66.2% | 2026-05-28
SAGE | Education | 3 | 54.788% | 2026-05-28
CorpFin v2 | Finance | 8 | 66.706% | 2026-05-28
Finance Agent v2 | Finance | 2 | 53.918% | 2026-05-28
Finance Agent v2 | Finance | 1 | 53.9% | 2026-05-28
MortgageTax | Finance | 2 | 69.912% | 2026-05-28
TaxEval v2 | Finance | 7 | 75.634% | 2026-05-28
HealthBench Professional | Healthcare | 1 | 55.8% | 2026-05-28
MedCode | Healthcare | 5 | 53.217% | 2026-05-28
MedScribe | Healthcare | 6 | 85.755% | 2026-05-28
GPQA Diamond | Intelligence | 4 | 92.424% | 2026-05-28
Humanity's Last Exam | Intelligence | 1 | 57.9% | 2026-05-28
MMLU Pro | Intelligence | 4 | 89.585% | 2026-05-28
MMMU Pro | Intelligence | 9 | 86.59% | 2026-05-28
Vals Index | Intelligence | 1 | 70.166% | 2026-05-28
Vals Multimodal Index | Intelligence | 1 | 70.712% | 2026-05-28
Harvey Legal Agent Benchmark | Legal | 1 | 9.6% | 2026-05-28
LegalBench | Legal | 27 | 83.568% | 2026-05-28
Graphwalks BFS 1M F1 | Long Context | 1 | 68.1% | 2026-05-28
Graphwalks BFS 256k F1 | Long Context | 1 | 85.9% | 2026-05-28
Graphwalks Parents 1M F1 | Long Context | 1 | 83.3% | 2026-05-28
Graphwalks Parents 256k F1 | Long Context | 1 | 99.3% | 2026-05-28
ProofBench | Math | 2 | 69% | 2026-05-28
ArxivMath | Mathematics | 1 | 71.8% | 2026-05-28
USAMO 2026 | Mathematics | 1 | 96.7% | 2026-05-28
Global MMLU | Multilingual | 3 | 90.4% | 2026-05-28
INCLUDE | Multilingual | 1 | 87.6% | 2026-05-28
MILU | Multilingual | 1 | 90.3% | 2026-05-28
Blueprint-Bench 2 | Multimodal | 7 | 0.606 +/- 0.010 | 2026-05-28
ChartMuseum | Multimodal | 1 | 89.7% | 2026-05-28
ChartQAPro | Multimodal | 1 | 72.3% | 2026-05-28
CharXiv-R | Multimodal | 2 | 89.9% | 2026-05-28
FigQA | Multimodal | 1 | 87.3% | 2026-05-28
GPQA Diamond | Reasoning | 3 | 93.6% | 2026-05-28
BioMysteryBench Human-Difficult | Science | 1 | 40% | 2026-05-28
BioMysteryBench Human-Solvable | Science | 2 | 80.4% | 2026-05-28
DeepSearchQA | Search | 2 | 93.1% | 2026-05-28
ProgramBench (Anthropic Harness) | Software Engineering | 1 | 88% | 2026-05-28
SWE-bench Multilingual | Software Engineering | 1 | 84.4% | 2026-05-28
SWE-bench Multimodal | Software Engineering | 1 | 38.4% | 2026-05-28
SWE-bench Pro | Software Engineering | 1 | 69.2% | 2026-05-28
SWE-bench Verified | Software Engineering | 1 | 88.6% | 2026-05-28