Llama 4 Scout

Llama / Meta

57 scores
55 benchmarks
$0.08 / $0.3 per 1M tokens (cost in/out)
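
The listed prices translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the prices are in USD per 1M tokens and that "in/out" refers to input (prompt) and output (completion) tokens. The function name `estimate_cost` and the token counts in the usage line are hypothetical, chosen only for illustration.

```python
# Prices taken from the listing above; assumed to be USD per 1M tokens.
INPUT_PRICE_PER_M = 0.08   # cost "in"
OUTPUT_PRICE_PER_M = 0.30  # cost "out"


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Illustrative token counts, not measurements.
cost = estimate_cost(input_tokens=12_000, output_tokens=1_500)
print(f"Estimated cost: ${cost:.6f}")
```

Actual billing may differ by provider, since the page does not say which host the prices come from.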

Metadata

Llama Open source

Aliases: llama-4-scout, llama-4-scout-17b-16e-instruct, meta-llama-llama-4-scout, meta-llama-llama-4-scout-17b-16e-instruct, meta-llama/llama-4-scout, meta-llama/llama-4-scout-17b-16e-instruct

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
ARC-AGI-1 | Agentic | 142 | 0.50 | 2026-05-05
ARC-AGI-2 | Agentic | 134 | 0 | 2026-05-05
Berkeley Function-Calling Leaderboard | Agentic | 72 | 28.13% | 2026-05-27
PinchBench | Agentic | 68 | 0.08 | 2026-05-06
Tau2-Bench Telecom | Agentic | 345 | 15.5% | 2026-05-11
Terminal-Bench Hard | Agentic | 330 | 1.5% | 2026-05-11
UAVBench | Agentic | 16 | 75.10 | 2026-05-06
OpenUGI | Alignment | 779 | 31.02 | 2026-05-06
Stick To Your Role! | Alignment | 18 | 0.62 | 2026-05-06
TextClass Benchmark | Classification | 70 | 1500.45 | 2026-05-06
LiveCodeBench | Coding | 102 | 38.541% | 2026-05-28
SciCode | Coding | 391 | 17% | 2026-05-11
NeoEvalPlusN | Creative | 131 | 10.25 | 2026-05-06
MMTU | Data | 22 | 0.39 | 2026-05-06
VAREX-Bench | Document Understanding | 7 | 94.3% EM | 2026-05-28
SAGE | Education | 39 | 34.834% | 2026-05-28
kluster.ai LLM Hallucination Detection Leaderboard | Factuality | 10 | 96.64 | 2026-05-06
Vectara HHEM Hallucination Leaderboard | Factuality | 37 | 92.30 | 2026-05-06
BizFinBench | Finance | 15 | 61.17 | 2026-05-27
CorpFin v2 | Finance | 88 | 46.776% | 2026-05-28
MortgageTax | Finance | 50 | 57.75% | 2026-05-28
TaxEval v2 | Finance | 108 | 55.192% | 2026-05-28
ALL Bench LLM | General Knowledge | 26 | 26.02 | 2026-05-06
BenchLM | General Knowledge | 106 | 22 | 2026-05-06
HealthBench Hard | Healthcare | 33 | 0.32 | 2026-05-27
MedCode | Healthcare | 59 | 23.311% | 2026-05-28
MedQA | Healthcare | 92 | 50.9% | 2026-04-16
MedScribe | Healthcare | 60 | 50.593% | 2026-05-28
Artificial Analysis Intelligence Index | Intelligence | 357 | 13.52 | 2026-05-11
GPQA Diamond | Intelligence | 99 | 46.97% | 2026-05-28
Humanity's Last Exam | Intelligence | 378 | 4.3% | 2026-05-11
MMLU Pro | Intelligence | 94 | 69.632% | 2026-05-28
MMLU-Pro | Intelligence | 175 | 75.2% | 2026-05-11
MMMU Pro | Intelligence | 65 | 58.752% | 2026-05-28
LegalBench | Legal | 82 | 72.036% | 2026-05-28
Fiction.LiveBench | Long Context | 22 | 27.30 | 2026-05-06
AIME | Math | 80 | 18.958% | 2026-04-16
AIME 2025 | Math | 221 | 14% | 2026-05-11
IneqMath | Math | 48 | 1.50 | 2026-05-06
MATH 500 | Math | 40 | 79.2% | 2026-01-09
MGSM | Math | 56 | 87.964% | 2026-01-09
FrontierMath 2025-02-28 Private | Mathematics | 24 | 0 | 2026-05-06
OTIS Mock AIME 2024-2025 | Mathematics | 30 | 7.78 | 2026-05-06
BRIDGE Medical Leaderboard | Medical | 91 | 40.64 | 2026-05-27
BRIDGE Medical Leaderboard | Medical | 174 | 35.12 | 2026-05-27
BRIDGE Medical Leaderboard | Medical | 234 | 29.38 | 2026-05-27
MEDIC Benchmark | Medical | 40 | 63.89 (average normalized public table score) | 2026-05-27
ALL Bench Multimodal | Multimodal | 25 | 27.51 | 2026-05-06
ChartQA | Multimodal | 5 | 0.89 | 2026-05-06
Design Arena | Multimodal | 122 | 848 | 2026-05-06
VTB | Multimodal | 19 | 1.58 | 2026-05-06
Artificial Analysis Openness Index | Openness | 181 | 27.78 | 2026-05-11
GPQA Diamond | Reasoning | 290 | 58.7% | 2026-05-11
CritPt | Science | 287 | 0% | 2026-05-11
MaCBench | Science | 3 | 0.63 | 2026-05-06
IDE-Bench | Software Engineering | 13 | 2.5 | 2026-05-27
LiveSQLBench | Text to SQL | 27 | 18.55 | 2026-05-06