Claude 3.7 Sonnet

Claude / Anthropic

83 scores
68 benchmarks
$3 / $15 per 1M tokens (cost in/out)
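
The in/out pricing above can be turned into a rough per-request estimate. The following is a minimal sketch, assuming the listed $3 (input) and $15 (output) per 1M tokens apply uniformly; the constant names and the `estimate_cost_usd` helper are hypothetical and not part of any official SDK, and it ignores any discounts or surcharges a provider might apply.

```python
# Assumed rates, taken from the card: $3 input / $15 output per 1M tokens.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens.
print(f"{estimate_cost_usd(10_000, 2_000):.4f}")  # prints 0.0600
```

Because output tokens cost five times as much as input tokens at these rates, the 2,000 output tokens in the example contribute as much to the total as the 10,000 input tokens.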

Metadata

Claude Closed/API

Aliases: anthropic-claude-3-7-sonnet-20250219, anthropic-claude-3.7-sonnet, anthropic/claude-3-7-sonnet-20250219, anthropic/claude-3.7-sonnet, claude-3-7-sonnet-20250219, claude-3.7-sonnet

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ALFWorld | Agentic | 8 | 0.833 | 2026-05-27 |
| MCP-Universe | Agentic | 14 | 24.24 | 2026-05-06 |
| OSWorld | Agentic | 62 | 35.8% | 2026-05-27 |
| OSWorld | Agentic | 63 | 35.6% | 2026-05-27 |
| OSWorld | Agentic | 80 | 27.1% | 2026-05-27 |
| Tau2-Bench Telecom | Agentic | 177 | 50% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 138 | 21.2% | 2026-05-11 |
| WildAgtEval | Agentic | 5 | 61.6% | 2026-05-28 |
| OpenUGI | Alignment | 515 | 36.38 | 2026-05-06 |
| OpenUGI | Alignment | 675 | 33.02 | 2026-05-06 |
| TextClass Benchmark | Classification | 69 | 1500.76 | 2026-05-06 |
| BigCodeBench-Hard | Coding | 4 | 32.40 | 2026-05-05 |
| BigCodeBench-Hard | Coding | 5 | 31.80 | 2026-05-05 |
| CadEval | Coding | 5 | 54 | 2026-05-06 |
| LiveCodeBench | Coding | 82 | 56.662% | 2026-05-28 |
| Natural Language to Mongosh | Coding | 2 | 0.89 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 3 | 0.88 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 4 | 0.87 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 5 | 0.87 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 6 | 0.87 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 8 | 0.86 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 9 | 0.86 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 15 | 0.86 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 16 | 0.86 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 22 | 0.85 | 2026-05-06 |
| Natural Language to Mongosh | Coding | 28 | 0.84 | 2026-05-06 |
| SciCode | Coding | 142 | 37.6% | 2026-05-11 |
| AIRTBench | Cybersecurity | 1 | 46.86 | 2026-05-06 |
| GSMA Open Telco Leaderboard | Domain | 17 | 65.56 | 2026-05-06 |
| K-12EduBench | Education | 17 | 61.20 | 2026-05-27 |
| RoboBench | Embodied | 6 | 40.53 | 2026-05-27 |
| FinEval | Finance | 29 | 62.9 | 2026-05-27 |
| MortgageTax | Finance | 8 | 68.68% | 2026-05-28 |
| TaxEval v2 | Finance | 40 | 72.404% | 2026-05-28 |
| HELM AIR-Bench | Generalization | 21 | 0.817703 | 2026-05-28 |
| HELM Safety | Generalization | 18 | 0.944914 | 2026-05-28 |
| WeirdML | Generalization | 15 | 39.97 | 2026-05-06 |
| GeoCode Leaderboard | Geospatial | 4 | 70.35% pass@1 | 2026-05-28 |
| OmniEarth-Bench | Geospatial | 4 | 29.07 | 2026-05-27 |
| HELM MedQA | Healthcare | 8 | 0.856859 | 2026-05-28 |
| HUMAINE | Human Preference | 31 | 3.40 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 142 | 30.81 | 2026-05-11 |
| GPQA Diamond | Intelligence | 74 | 67.424% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 322 | 4.8% | 2026-05-11 |
| MathVision | Intelligence | 43 | 58.60 | 2026-05-06 |
| MMLU Pro | Intelligence | 57 | 80.663% | 2026-05-28 |
| MMLU-Pro | Intelligence | 110 | 80.3% | 2026-05-11 |
| MMMU Pro | Intelligence | 48 | 71.519% | 2026-05-28 |
| AraGen v3 | Language | 7 | 78.16 | 2026-05-06 |
| HindiGen v1 | Language | 12 | 70.77 | 2026-05-06 |
| WinoGrande | Language | 17 | 75.10 | 2026-05-06 |
| LegalBench | Legal | 60 | 80.001% | 2026-05-28 |
| LEXam | Legal | 3 | 62.86% open / 57.23% MCQ | 2026-05-28 |
| Fiction.LiveBench | Long Context | 13 | 53.10 | 2026-05-06 |
| AIME | Math | 79 | 22.292% | 2026-04-16 |
| AIME 2025 | Math | 208 | 21% | 2026-05-11 |
| IneqMath | Math | 45 | 2 | 2026-05-06 |
| IneqMath | Math | 50 | 1 | 2026-05-06 |
| MATH 500 | Math | 43 | 76.8% | 2026-01-09 |
| MGSM | Math | 19 | 92.4% | 2026-01-09 |
| FrontierMath 2025-02-28 Private | Mathematics | 17 | 4.14 | 2026-05-06 |
| FrontierMath Tier 4 2025-07-01 Private | Mathematics | 12 | 0 | 2026-05-06 |
| MATH-500 | Mathematics | 14 | 0.96 | 2026-05-06 |
| OTIS Mock AIME 2024-2025 | Mathematics | 18 | 57.78 | 2026-05-06 |
| LiveMedBench | Medical | 11 | 0.1699 | 2026-05-27 |
| MedHELM | Medical | 3 | 0.6357142857142857 | 2026-05-27 |
| AfroBench-Lite | Multilingual | 11 | 60.26 | 2026-05-06 |
| LanguageBench | Multilingual | 3 | 0.68 | 2026-05-06 |
| Design Arena | Multimodal | 37 | 1235 | 2026-05-06 |
| Video SimpleQA | Multimodal | 9 | 36.20 | 2026-05-06 |
| Visual-Language Understanding | Multimodal | 34 | 43.02 | 2026-05-06 |
| VPCT | Multimodal | 9 | 39 | 2026-05-06 |
| Balrog | Reasoning | 5 | 32.60 | 2026-05-06 |
| EnigmaEval | Reasoning | 25 | 2.26 | 2026-05-06 |
| GPQA Diamond | Reasoning | 245 | 65.6% | 2026-05-11 |
| LingOly-TOO | Reasoning | 3 | 0.43 | 2026-05-06 |
| SimpleBench | Reasoning | 7 | 46.40 | 2026-05-06 |
| CritPt | Science | 160 | 0% | 2026-05-11 |
| GSO-Bench | Science | 7 | 4.60 | 2026-05-06 |
| Defects4J | Software Engineering | 3 | 0.478 | 2026-05-27 |
| RepairBench | Software Engineering | 4 | 0.44 | 2026-05-27 |
| LiveSQLBench | Text to SQL | 21 | 25.75 | 2026-05-06 |
| Lech Mazur Writing | Writing | 13 | 8.11 | 2026-05-06 |