DeepSeek V3 0324
DeepSeek / DeepSeek
33 scores
33 benchmarks
$0.2 / $0.77 per 1M tokens (cost in/out)
Metadata
DeepSeek (open source)
Aliases: deepseek-chat-v3-0324, deepseek-deepseek-chat-v3-0324, deepseek/deepseek-chat-v3-0324
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| Tau2-Bench Telecom | Agentic | 183 | 47.1% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 175 | 15.2% | 2026-05-11 |
| UAVBench | Agentic | 14 | 75.90 | 2026-05-06 |
| IOI | Coding | 50 | 1.667% | 2026-05-26 |
| LiveCodeBench | Coding | 71 | 65.478% | 2026-05-28 |
| SciCode | Coding | 181 | 35.8% | 2026-05-11 |
| GSMA Open Telco Leaderboard | Domain | 35 | 59.28 | 2026-05-06 |
| kluster.ai LLM Hallucination Detection Leaderboard | Factuality | 7 | 97.22 | 2026-05-06 |
| CorpFin v2 | Finance | 68 | 54.74% | 2026-05-28 |
| TaxEval v2 | Finance | 60 | 71.096% | 2026-05-28 |
| GeoCode Leaderboard | Geospatial | 5 | 70.25% pass@1 | 2026-05-28 |
| MedQA | Healthcare | 68 | 82% | 2026-04-16 |
| HUMAINE | Human Preference | 14 | 3.64 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 223 | 22.28 | 2026-05-11 |
| GPQA Diamond | Intelligence | 83 | 61.616% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 274 | 5.2% | 2026-05-11 |
| MMLU Pro | Intelligence | 66 | 79.474% | 2026-05-28 |
| MMLU-Pro | Intelligence | 78 | 81.9% | 2026-05-11 |
| J1-ENVS | Legal | 6 | 53.86 | 2026-05-26 |
| LegalBench | Legal | 75 | 77.727% | 2026-05-28 |
| AIME | Math | 60 | 52.202% | 2026-04-16 |
| AIME 2025 | Math | 157 | 41% | 2026-05-11 |
| IneqMath | Math | 24 | 7 | 2026-05-06 |
| MATH 500 | Math | 30 | 88.6% | 2026-01-09 |
| MGSM | Math | 27 | 91.673% | 2026-01-09 |
| MATH-500 | Mathematics | 22 | 0.94 | 2026-05-06 |
| MedSafe-Dx | Medical | 7 | 85.2 | 2026-05-27 |
| LanguageBench | Multilingual | 17 | 0.51 | 2026-05-06 |
| GPQA Diamond | Reasoning | 247 | 65.5% | 2026-05-11 |
| LiveSecBench | Safety | 41 | 18.08 | 2026-05-27 |
| CritPt | Science | 170 | 0% | 2026-05-11 |
| Defects4J | Software Engineering | 8 | 0.43 | 2026-05-27 |
| RepairBench | Software Engineering | 8 | 0.396 | 2026-05-27 |