R1 Distill Llama 70B | BenchmarkList

Metadata

Llama, Open source

Aliases: deepseek-deepseek-r1-distill-llama-70b, deepseek-r1-distill-llama-70b, deepseek/deepseek-r1-distill-llama-70b

Benchmark	Category	Rank	Score	Sampled
AgentIF	Agentic	11	55	2026-05-27
Tau2-Bench Telecom	Agentic	303	21.9%	2026-05-11
Terminal-Bench Hard	Agentic	327	1.5%	2026-05-11
BigCodeBench	Coding	86	35.30	2026-05-06
SciCode	Coding	235	31.2%	2026-05-11
AI Energy Score	Efficiency	90	5	2026-05-06
AI Energy Score	Efficiency	200	1	2026-05-06
Open LLM Leaderboard v2	General Knowledge	1320	27.81	2026-05-06
HealthBench Hard	Healthcare	13	0.47	2026-05-27
Artificial Analysis Intelligence Index	Intelligence	304	15.95	2026-05-11
Humanity's Last Exam	Intelligence	238	6.1%	2026-05-11
MMLU-Pro	Intelligence	123	79.5%	2026-05-11
MuSR	Intelligence	1218	13.28	2026-05-06
Open Japanese LLM Leaderboard	Language	236	58.27	2026-05-06
Open Japanese LLM Leaderboard	Language	754	22.80	2026-05-06
Open Portuguese LLM Leaderboard	Language	760	59.63	2026-05-06
AIME 2025	Math	133	53.7%	2026-05-11
MATH Level 5	Math	801	30.74	2026-05-06
MATH-500	Mathematics	20	0.94	2026-05-06
BRIDGE Medical Leaderboard	Medical	41	46.17	2026-05-27
BRIDGE Medical Leaderboard	Medical	106	39.79	2026-05-27
BRIDGE Medical Leaderboard	Medical	121	38.95	2026-05-27
MEDIC Benchmark	Medical	27	68.84 (average normalized public table score)	2026-05-27
Artificial Analysis Openness Index	Openness	157	36.11	2026-05-11
GPQA Diamond	Reasoning	396	40.2%	2026-05-11
CritPt	Science	168	0%	2026-05-11
Defects4J	Software Engineering	31	0.221	2026-05-27
RepairBench	Software Engineering	30	0.208	2026-05-27