Claude Haiku 4.5
Claude / Anthropic
93 scores
77 benchmarks
$1 / $5 per 1M tokens (cost in/out)
Metadata
Claude Closed/API
Aliases: anthropic-claude-4.5-haiku-20251001, anthropic-claude-haiku-4.5, anthropic/claude-4.5-haiku-20251001, anthropic/claude-haiku-4.5, claude-4.5-haiku-20251001, claude-haiku-4.5
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| APEX-Agents | Agentic | 29 | 21.40 | 2026-05-06 |
| ARC-AGI-1 | Agentic | 68 | 47.67 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 82 | 37.33 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 103 | 25.50 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 114 | 16.83 | 2026-05-05 |
| ARC-AGI-1 | Agentic | 121 | 14.33 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 75 | 4.03 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 83 | 2.78 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 103 | 1.67 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 112 | 1.25 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 113 | 1.25 | 2026-05-05 |
| AutoBench | Agentic | 12 | 2.99 | 2026-05-06 |
| Berkeley Function-Calling Leaderboard | Agentic | 6 | 68.7% | 2026-05-27 |
| Berkeley Function-Calling Leaderboard | Agentic | 87 | 25.26% | 2026-05-27 |
| MCP Atlas | Agentic | 19 | 40.20 | 2026-05-06 |
| MultiChallenge | Agentic | 19 | 50.49 | 2026-05-06 |
| PinchBench | Agentic | 6 | 0.89 | 2026-05-06 |
| RealDataAgentBench | Agentic | 7 | 0.80 | 2026-04-28 |
| RuneBench | Agentic | 16 | 1.60 | 2026-05-05 |
| Tau2 Airline | Agentic | 7 | 0.64 | 2026-05-06 |
| UAVBench | Agentic | 9 | 77.05 | 2026-05-06 |
| Vending-Bench 2 | Agentic | 32 | 458.89 | 2026-05-28 |
| OpenUGI | Alignment | 1113 | 17.55 | 2026-05-06 |
| OpenUGI | Alignment | 1193 | 9.17 | 2026-05-06 |
| Arena AI Code | Coding | 54 | 1317 | 2026-05-06 |
| DeepSWE | Coding | 15 | 0.22 | 2026-05-26 |
| IOI | Coding | 36 | 6.166% | 2026-05-26 |
| LiveCodeBench | Coding | 100 | 41.175% | 2026-05-28 |
| SWE-bench Verified | Coding | 38 | 66.6% | 2026-05-28 |
| Terminal-Bench 2.0 | Coding | 35 | 38.202% | 2026-05-28 |
| Terminal-Bench 2.1 | Coding | 14 | 43.82% | 2026-05-28 |
| Vibe Code Bench v1.1 | Coding | 37 | 11.393% | 2026-05-28 |
| VibeCodingBench | Coding | 2 | 88.97 | 2026-05-06 |
| Cybersecurity CTFs | Cybersecurity | 2 | 0.47 | 2026-05-06 |
| ExploitBench v8-bench | Cybersecurity | 17 | 2.12 points | 2026-05-15 |
| ExploitBench v8-bench | Cybersecurity | 18 | 2.15 points | 2026-05-15 |
| OrgForge-IT | Cybersecurity | 3 | 0.800 | 2026-05-28 |
| SecCodeBench | Cybersecurity | 20 | 52.45% | 2026-05-28 |
| Arena AI Document | Document AI | 19 | 1424 | 2026-05-06 |
| GSMA Open Telco Leaderboard | Domain | 33 | 59.99 | 2026-05-06 |
| SAGE | Education | 42 | 31.822% | 2026-05-28 |
| Vectara HHEM Hallucination Leaderboard | Factuality | 54 | 90.20 | 2026-05-06 |
| CorpFin v2 | Finance | 45 | 60.606% | 2026-05-28 |
| CorpFin v2 | Finance | 48 | 60.295% | 2026-05-28 |
| Finance Agent v1.1 | Finance | 32 | 46.931% | 2026-05-04 |
| Finance Agent v2 | Finance | 17 | 31.01% | 2026-05-28 |
| MortgageTax | Finance | 35 | 62.162% | 2026-05-28 |
| TaxEval v2 | Finance | 78 | 67.539% | 2026-05-28 |
| InfiniteBM Chess | Game | 4 | 936.92 Elo / 12 games | 2026-05-28 |
| InfiniteBM Coup | Game | 3 | 1488.43 Elo / 29 games | 2026-05-28 |
| InfiniteBM Heads-Up No-Limit Hold'em | Game | 12 | 1256.09 Elo / 23 games | 2026-05-28 |
| InfiniteBM Heads-Up No-Limit Hold'em | Game | 16 | 1183.15 Elo / 114 games | 2026-05-28 |
| InfiniteBM Liar's Dice | Game | 33 | 932.45 Elo / 41 games | 2026-05-28 |
| InfiniteBM Liar's Dice | Game | 36 | 811.57 Elo / 116 games | 2026-05-28 |
| InfiniteBM Settlers of Catan | Game | 6 | 358.41 Elo / 21 games | 2026-05-28 |
| InfiniteBM Werewolf | Game | 5 | 1159.65 Elo / 19 games | 2026-05-28 |
| InfiniteBM Werewolf | Game | 8 | 907.28 Elo / 25 games | 2026-05-28 |
| MageBench Season 1 | Game | 10 | 1637 rating / 10 games | 2026-05-28 |
| ALL Bench LLM | General Knowledge | 28 | 24.75 | 2026-05-06 |
| BenchLM | General Knowledge | 52 | 58 | 2026-05-06 |
| MedCode | Healthcare | 49 | 32.678% | 2026-05-28 |
| MedQA | Healthcare | 74 | 79.567% | 2026-04-16 |
| MedScribe | Healthcare | 8 | 85.23% | 2026-05-28 |
| GPQA Diamond | Intelligence | 62 | 72.222% | 2026-05-28 |
| MMLU Pro | Intelligence | 72 | 78.715% | 2026-05-28 |
| MMMU Pro | Intelligence | 71 | 46.069% | 2026-05-28 |
| Vals Index | Intelligence | 17 | 40.325% | 2026-05-28 |
| Vals Multimodal Index | Intelligence | 13 | 42.352% | 2026-05-28 |
| AraGen v3 | Language | 15 | 65.80 | 2026-05-06 |
| CaseLaw v2 | Legal | 30 | 56.484% | 2026-05-04 |
| LegalBench | Legal | 50 | 81.238% | 2026-05-28 |
| AIME | Math | 44 | 82.708% | 2026-04-16 |
| MGSM | Math | 22 | 92.146% | 2026-01-09 |
| FrontierMath 2025-02-28 Private | Mathematics | 13 | 5.90 | 2026-05-06 |
| FrontierMath Tier 4 2025-07-01 Private | Mathematics | 9 | 2.08 | 2026-05-06 |
| OTIS Mock AIME 2024-2025 | Mathematics | 16 | 66.67 | 2026-05-06 |
| MedSafe-Dx | Medical | 2 | 95.6 | 2026-05-27 |
| ALL Bench Multimodal | Multimodal | 20 | 30.14 | 2026-05-06 |
| Blueprint-Bench 2 | Multimodal | 10 | 0.367 +/- 0.017 | 2026-05-28 |
| Design Arena | Multimodal | 73 | 1174 | 2026-05-06 |
| IDP Leaderboard | Multimodal | 16 | 71.24 | 2026-05-06 |
| CAIS Text Capabilities Index | Reasoning | 28 | 17.4 | 2026-05-27 |
| Context Arena | Reasoning | 35 | 36.27 | 2026-05-06 |
| Context Arena | Reasoning | 66 | 17.68 | 2026-05-06 |
| CAIS Risk Index | Safety | 14 | 47.0 | 2026-05-27 |
| HarmActionsEval | Safety | 8 | 0 | 2026-05-06 |
| LiveSecBench | Safety | 1 | 91.43 | 2026-05-27 |
| BioMysteryBench Human-Difficult | Science | 5 | 5.2% | 2026-04-29 |
| BioMysteryBench Human-Solvable | Science | 5 | 36.8% | 2026-04-29 |
| IDE-Bench | Software Engineering | 4 | 78.75 | 2026-05-27 |
| ProgramBench | Software Engineering | 7 | 0% | 2026-05-05 |
| SWE-PRBench | Software Engineering | 1 | 0.153 | 2026-05-27 |
| CAIS Vision Capabilities Index | Vision | 28 | 42.9 | 2026-05-27 |