DeepSeek V4 Flash

DeepSeek / DeepSeek

51 scores
29 benchmarks
$0.14 / $0.28 per 1M tokens (cost in/out)
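
As a rough illustration of what the listed price means per request, the sketch below estimates cost from token counts. It assumes "in/out" refers to input and output tokens billed at $0.14 and $0.28 per 1M tokens respectively, and it ignores any caching discounts or other pricing tiers the provider may apply. The helper name `estimate_cost` is hypothetical and not part of any API.

```python
# Prices taken from the listing, in USD per 1M tokens.
# Assumption: the first figure is the input price, the second the output price.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.28


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost(200_000, 50_000):.4f}")  # prints "$0.0420"
```

Under these assumptions, output tokens cost twice as much as input tokens, so long generations tend to dominate the bill more than long prompts.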

Metadata

DeepSeek Open source

Aliases: deepseek-deepseek-v4-flash, deepseek-deepseek-v4-flash-20260423, deepseek-v4-flash, deepseek-v4-flash-20260423, deepseek/deepseek-v4-flash, deepseek/deepseek-v4-flash-20260423

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
GDPval-AA | Agentic | 8 | 1395 | 2026-05-06
Gert Labs Rankings | Agentic | 21 | 0.52 | 2026-05-11
ITBench-AA | Agentic | 15 | 31.5% | 2026-05-28
Tau2-Bench Telecom | Agentic | 16 | 95.6% | 2026-05-11
Tau2-Bench Telecom | Agentic | 21 | 95% | 2026-05-11
Tau2-Bench Telecom | Agentic | 24 | 94.4% | 2026-05-11
Terminal-Bench Hard | Agentic | 41 | 38.6% | 2026-05-11
Terminal-Bench Hard | Agentic | 56 | 35.6% | 2026-05-11
Terminal-Bench Hard | Agentic | 69 | 34.1% | 2026-05-11
Toolathlon | Agentic | 6 | 0.48 | 2026-05-06
OpenUGI | Alignment | 17 | 59.21 | 2026-05-06
OpenUGI | Alignment | 29 | 56.63 | 2026-05-06
ALE-Bench | Coding | 50 | 678.20 | 2026-05-06
ALE-Bench | Coding | 80 | 324.98 | 2026-05-06
BLXBench | Coding | 13 | 48.30 | 2026-05-06
Codeforces | Coding | 1 | 1 | 2026-05-28
SciCode | Coding | 42 | 44.9% | 2026-05-11
SciCode | Coding | 68 | 42% | 2026-05-11
SciCode | Coding | 150 | 37.3% | 2026-05-11
AA-Omniscience | Factuality | 19 | -22.9 | 2026-05-11
InfiniteBM Heads-Up No-Limit Hold'em | Game | 6 | 1359.06 Elo / 13 games | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 14 | 1212.44 Elo / 109 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 9 | 1326.72 Elo / 26 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 31 | 1036.17 Elo / 111 games | 2026-05-28
BenchLM | General Knowledge | 26 | 76 | 2026-05-06
BenchLM | General Knowledge | 31 | 71 | 2026-05-06
BenchLM | General Knowledge | 49 | 59 | 2026-05-06
CSimpleQA | General Knowledge | 4 | 0.79 | 2026-05-06
HUMAINE | Human Preference | 12 | 3.66 | 2026-05-06
Artificial Analysis Intelligence Index | Intelligence | 37 | 46.52 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 43 | 44.87 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 99 | 36.46 | 2026-05-11
Humanity's Last Exam | Intelligence | 21 | 32.1% | 2026-05-11
Humanity's Last Exam | Intelligence | 33 | 27.8% | 2026-05-11
Humanity's Last Exam | Intelligence | 212 | 7% | 2026-05-11
LiveBench | Intelligence | 40 | 67.67 | 2026-05-05
CorpusQA 1M | Long Context | 2 | 0.60 | 2026-05-06
MRCR 1M | Long Context | 2 | 0.79 | 2026-05-06
IMO-AnswerBench | Mathematics | 2 | 0.88 | 2026-05-06
MathArena Apex | Mathematics | 2 | 0.86 | 2026-05-06
Design Arena | Multimodal | 22 | 1280 | 2026-05-06
Artificial Analysis Openness Index | Openness | 45 | 50 | 2026-05-11
Artificial Analysis Openness Index | Openness | 46 | 50 | 2026-05-11
Context Arena | Reasoning | 22 | 50.93 | 2026-05-06
Context Arena | Reasoning | 60 | 23.47 | 2026-05-06
GPQA Diamond | Reasoning | 18 | 89.4% | 2026-05-11
GPQA Diamond | Reasoning | 37 | 86.7% | 2026-05-11
GPQA Diamond | Reasoning | 190 | 71.6% | 2026-05-11
CritPt | Science | 27 | 7.1% | 2026-05-11
CritPt | Science | 43 | 3.4% | 2026-05-11
CritPt | Science | 125 | 0.3% | 2026-05-11