DeepSeek V3 0324

DeepSeek / DeepSeek

33 scores
33 benchmarks
$0.2 / $0.77 per 1M tokens (cost in/out)
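
As a rough illustration of how the listed per-token prices translate into a per-request cost, here is a minimal sketch. It assumes simple linear billing at $0.2 per 1M input tokens and $0.77 per 1M output tokens, with no caching discounts, minimum charges, or provider-specific surcharges. The function name `estimate_cost_usd` and the token counts are hypothetical and chosen only for illustration.

```python
# Prices from the listing (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.77


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost, assuming linear per-token billing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 12,000 prompt tokens and 1,500 completion tokens.
    cost = estimate_cost_usd(12_000, 1_500)
    print(f"Estimated cost: ${cost:.6f}")
```

Actual billing may differ by provider, so treat the result as an estimate rather than a quote.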

Metadata

DeepSeek Open source

Aliases: deepseek-chat-v3-0324, deepseek-deepseek-chat-v3-0324, deepseek/deepseek-chat-v3-0324

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
Tau2-Bench Telecom | Agentic | 183 | 47.1% | 2026-05-11
Terminal-Bench Hard | Agentic | 175 | 15.2% | 2026-05-11
UAVBench | Agentic | 14 | 75.90 | 2026-05-06
IOI | Coding | 50 | 1.667% | 2026-05-26
LiveCodeBench | Coding | 71 | 65.478% | 2026-05-28
SciCode | Coding | 181 | 35.8% | 2026-05-11
GSMA Open Telco Leaderboard | Domain | 35 | 59.28 | 2026-05-06
kluster.ai LLM Hallucination Detection Leaderboard | Factuality | 7 | 97.22 | 2026-05-06
CorpFin v2 | Finance | 68 | 54.74% | 2026-05-28
TaxEval v2 | Finance | 60 | 71.096% | 2026-05-28
GeoCode Leaderboard | Geospatial | 5 | 70.25% pass@1 | 2026-05-28
MedQA | Healthcare | 68 | 82% | 2026-04-16
HUMAINE | Human Preference | 14 | 3.64 | 2026-05-06
Artificial Analysis Intelligence Index | Intelligence | 223 | 22.28 | 2026-05-11
GPQA Diamond | Intelligence | 83 | 61.616% | 2026-05-28
Humanity's Last Exam | Intelligence | 274 | 5.2% | 2026-05-11
MMLU Pro | Intelligence | 66 | 79.474% | 2026-05-28
MMLU-Pro | Intelligence | 78 | 81.9% | 2026-05-11
J1-ENVS | Legal | 6 | 53.86 | 2026-05-26
LegalBench | Legal | 75 | 77.727% | 2026-05-28
AIME | Math | 60 | 52.202% | 2026-04-16
AIME 2025 | Math | 157 | 41% | 2026-05-11
IneqMath | Math | 24 | 7 | 2026-05-06
MATH 500 | Math | 30 | 88.6% | 2026-01-09
MGSM | Math | 27 | 91.673% | 2026-01-09
MATH-500 | Mathematics | 22 | 0.94 | 2026-05-06
MedSafe-Dx | Medical | 7 | 85.2 | 2026-05-27
LanguageBench | Multilingual | 17 | 0.51 | 2026-05-06
GPQA Diamond | Reasoning | 247 | 65.5% | 2026-05-11
LiveSecBench | Safety | 41 | 18.08 | 2026-05-27
CritPt | Science | 170 | 0% | 2026-05-11
Defects4J | Software Engineering | 8 | 0.43 | 2026-05-27
RepairBench | Software Engineering | 8 | 0.396 | 2026-05-27