Claude Haiku 4.5

Claude / Anthropic

93 scores
77 benchmarks
$1 / $5 per 1M tokens (cost in/out)
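
To make the listed price concrete, here is a minimal sketch that estimates the cost of a request at $1 per 1M input tokens and $5 per 1M output tokens, reading "in/out" as input and output tokens. The function name `estimate_cost_usd` and the token counts are hypothetical, chosen only for illustration, and the estimate ignores any discounts or other pricing tiers that may apply.

```python
# List prices from the listing, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 5.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate at the listed prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Illustrative workload (assumed numbers): 200k input tokens, 50k output tokens.
print(f"{estimate_cost_usd(200_000, 50_000):.2f}")  # prints 0.45
```

Because output tokens cost five times as much as input tokens here, the 50k output tokens in this example account for more of the total than the 200k input tokens.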

Metadata

Claude Closed/API

Aliases: anthropic-claude-4.5-haiku-20251001, anthropic-claude-haiku-4.5, anthropic/claude-4.5-haiku-20251001, anthropic/claude-haiku-4.5, claude-4.5-haiku-20251001, claude-haiku-4.5

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
APEX-Agents | Agentic | 29 | 21.40 | 2026-05-06
ARC-AGI-1 | Agentic | 68 | 47.67 | 2026-05-05
ARC-AGI-1 | Agentic | 82 | 37.33 | 2026-05-05
ARC-AGI-1 | Agentic | 103 | 25.50 | 2026-05-05
ARC-AGI-1 | Agentic | 114 | 16.83 | 2026-05-05
ARC-AGI-1 | Agentic | 121 | 14.33 | 2026-05-05
ARC-AGI-2 | Agentic | 75 | 4.03 | 2026-05-05
ARC-AGI-2 | Agentic | 83 | 2.78 | 2026-05-05
ARC-AGI-2 | Agentic | 103 | 1.67 | 2026-05-05
ARC-AGI-2 | Agentic | 112 | 1.25 | 2026-05-05
ARC-AGI-2 | Agentic | 113 | 1.25 | 2026-05-05
AutoBench | Agentic | 12 | 2.99 | 2026-05-06
Berkeley Function-Calling Leaderboard | Agentic | 6 | 68.7% | 2026-05-27
Berkeley Function-Calling Leaderboard | Agentic | 87 | 25.26% | 2026-05-27
MCP Atlas | Agentic | 19 | 40.20 | 2026-05-06
MultiChallenge | Agentic | 19 | 50.49 | 2026-05-06
PinchBench | Agentic | 6 | 0.89 | 2026-05-06
RealDataAgentBench | Agentic | 7 | 0.80 | 2026-04-28
RuneBench | Agentic | 16 | 1.60 | 2026-05-05
Tau2 Airline | Agentic | 7 | 0.64 | 2026-05-06
UAVBench | Agentic | 9 | 77.05 | 2026-05-06
Vending-Bench 2 | Agentic | 32 | 458.89 | 2026-05-28
OpenUGI | Alignment | 1113 | 17.55 | 2026-05-06
OpenUGI | Alignment | 1193 | 9.17 | 2026-05-06
Arena AI Code | Coding | 54 | 1317 | 2026-05-06
DeepSWE | Coding | 15 | 0.22 | 2026-05-26
IOI | Coding | 36 | 6.166% | 2026-05-26
LiveCodeBench | Coding | 100 | 41.175% | 2026-05-28
SWE-bench Verified | Coding | 38 | 66.6% | 2026-05-28
Terminal-Bench 2.0 | Coding | 35 | 38.202% | 2026-05-28
Terminal-Bench 2.1 | Coding | 14 | 43.82% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 37 | 11.393% | 2026-05-28
VibeCodingBench | Coding | 2 | 88.97 | 2026-05-06
Cybersecurity CTFs | Cybersecurity | 2 | 0.47 | 2026-05-06
ExploitBench v8-bench | Cybersecurity | 17 | 2.12 points | 2026-05-15
ExploitBench v8-bench | Cybersecurity | 18 | 2.15 points | 2026-05-15
OrgForge-IT | Cybersecurity | 3 | 0.800 | 2026-05-28
SecCodeBench | Cybersecurity | 20 | 52.45% | 2026-05-28
Arena AI Document | Document AI | 19 | 1424 | 2026-05-06
GSMA Open Telco Leaderboard | Domain | 33 | 59.99 | 2026-05-06
SAGE | Education | 42 | 31.822% | 2026-05-28
Vectara HHEM Hallucination Leaderboard | Factuality | 54 | 90.20 | 2026-05-06
CorpFin v2 | Finance | 45 | 60.606% | 2026-05-28
CorpFin v2 | Finance | 48 | 60.295% | 2026-05-28
Finance Agent v1.1 | Finance | 32 | 46.931% | 2026-05-04
Finance Agent v2 | Finance | 17 | 31.01% | 2026-05-28
MortgageTax | Finance | 35 | 62.162% | 2026-05-28
TaxEval v2 | Finance | 78 | 67.539% | 2026-05-28
InfiniteBM Chess | Game | 4 | 936.92 Elo / 12 games | 2026-05-28
InfiniteBM Coup | Game | 3 | 1488.43 Elo / 29 games | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 12 | 1256.09 Elo / 23 games | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 16 | 1183.15 Elo / 114 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 33 | 932.45 Elo / 41 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 36 | 811.57 Elo / 116 games | 2026-05-28
InfiniteBM Settlers of Catan | Game | 6 | 358.41 Elo / 21 games | 2026-05-28
InfiniteBM Werewolf | Game | 5 | 1159.65 Elo / 19 games | 2026-05-28
InfiniteBM Werewolf | Game | 8 | 907.28 Elo / 25 games | 2026-05-28
MageBench Season 1 | Game | 10 | 1637 rating / 10 games | 2026-05-28
ALL Bench LLM | General Knowledge | 28 | 24.75 | 2026-05-06
BenchLM | General Knowledge | 52 | 58 | 2026-05-06
MedCode | Healthcare | 49 | 32.678% | 2026-05-28
MedQA | Healthcare | 74 | 79.567% | 2026-04-16
MedScribe | Healthcare | 8 | 85.23% | 2026-05-28
GPQA Diamond | Intelligence | 62 | 72.222% | 2026-05-28
MMLU Pro | Intelligence | 72 | 78.715% | 2026-05-28
MMMU Pro | Intelligence | 71 | 46.069% | 2026-05-28
Vals Index | Intelligence | 17 | 40.325% | 2026-05-28
Vals Multimodal Index | Intelligence | 13 | 42.352% | 2026-05-28
AraGen v3 | Language | 15 | 65.80 | 2026-05-06
CaseLaw v2 | Legal | 30 | 56.484% | 2026-05-04
LegalBench | Legal | 50 | 81.238% | 2026-05-28
AIME | Math | 44 | 82.708% | 2026-04-16
MGSM | Math | 22 | 92.146% | 2026-01-09
FrontierMath 2025-02-28 Private | Mathematics | 13 | 5.90 | 2026-05-06
FrontierMath Tier 4 2025-07-01 Private | Mathematics | 9 | 2.08 | 2026-05-06
OTIS Mock AIME 2024-2025 | Mathematics | 16 | 66.67 | 2026-05-06
MedSafe-Dx | Medical | 2 | 95.6 | 2026-05-27
ALL Bench Multimodal | Multimodal | 20 | 30.14 | 2026-05-06
Blueprint-Bench 2 | Multimodal | 10 | 0.367 +/- 0.017 | 2026-05-28
Design Arena | Multimodal | 73 | 1174 | 2026-05-06
IDP Leaderboard | Multimodal | 16 | 71.24 | 2026-05-06
CAIS Text Capabilities Index | Reasoning | 28 | 17.4 | 2026-05-27
Context Arena | Reasoning | 35 | 36.27 | 2026-05-06
Context Arena | Reasoning | 66 | 17.68 | 2026-05-06
CAIS Risk Index | Safety | 14 | 47.0 | 2026-05-27
HarmActionsEval | Safety | 8 | 0 | 2026-05-06
LiveSecBench | Safety | 1 | 91.43 | 2026-05-27
BioMysteryBench Human-Difficult | Science | 5 | 5.2% | 2026-04-29
BioMysteryBench Human-Solvable | Science | 5 | 36.8% | 2026-04-29
IDE-Bench | Software Engineering | 4 | 78.75 | 2026-05-27
ProgramBench | Software Engineering | 7 | 0% | 2026-05-05
SWE-PRBench | Software Engineering | 1 | 0.153 | 2026-05-27
CAIS Vision Capabilities Index | Vision | 28 | 42.9 | 2026-05-27