Gemini 3 Flash Preview

Gemini / Google

121 scores
94 benchmarks
$0.5 / $3 per 1M tokens (cost in/out)
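
The listed rates can be turned into a rough per-request estimate. The sketch below assumes the first figure ($0.5) is the input rate and the second ($3) is the output rate, both per 1M tokens, as the "cost in/out" label suggests. The function name and the example token counts are hypothetical, and the estimate ignores any caching, batching, or tiered pricing the provider may apply.

```python
# Rates taken from the listing above; the input/output assignment is an assumption
# based on the "cost in/out" label.
INPUT_USD_PER_MILLION = 0.5
OUTPUT_USD_PER_MILLION = 3.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return a rough cost estimate in USD for one request (hypothetical helper)."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # $0.25
```

At these rates, output tokens cost six times as much as input tokens, so long generations dominate the bill even when prompts are large.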

Metadata

Gemini Closed/API

Aliases: gemini-3-flash-preview, gemini-3-flash-preview-20251217, google-gemini-3-flash-preview, google-gemini-3-flash-preview-20251217, google/gemini-3-flash-preview, google/gemini-3-flash-preview-20251217

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
APEX-Agents | Agentic | 11 | 39.50 | 2026-05-06
APEX-Agents-AA | Agentic | 7 | 27.7% | 2026-05-11
ARC-AGI-1 | Agentic | 31 | 84.67 | 2026-05-05
ARC-AGI-1 | Agentic | 56 | 57.67 | 2026-05-05
ARC-AGI-1 | Agentic | 95 | 29 | 2026-05-05
ARC-AGI-1 | Agentic | 107 | 21.50 | 2026-05-05
ARC-AGI-2 | Agentic | 33 | 33.61 | 2026-05-05
ARC-AGI-2 | Agentic | 48 | 12.78 | 2026-05-05
ARC-AGI-2 | Agentic | 79 | 3.33 | 2026-05-05
ARC-AGI-2 | Agentic | 114 | 1.25 | 2026-05-05
AutoBench | Agentic | 13 | 2.98 | 2026-05-06
EnterpriseOps-Gym | Agentic | 5 | 31.7% | 2026-05-05
Gert Labs Rankings | Agentic | 18 | 0.53 | 2026-05-11
MCP Atlas | Agentic | 11 | 62 | 2026-05-06
PinchBench | Agentic | 17 | 0.87 | 2026-05-06
Poker Agent | Agentic | 3 | 1100.213% | 2025-12-23
t2-bench | Agentic | 2 | 0.90 | 2026-05-06
Tau2-Bench Telecom | Agentic | 106 | 80.4% | 2026-05-11
Tau2-Bench Telecom | Agentic | 197 | 43.3% | 2026-05-11
Terminal-Bench Hard | Agentic | 42 | 38.6% | 2026-05-11
Terminal-Bench Hard | Agentic | 84 | 31.8% | 2026-05-11
Toolathlon | Agentic | 5 | 0.49 | 2026-05-06
Vending-Bench 2 | Agentic | 19 | 3634.72 | 2026-05-28
VitaBench | Agentic | 1 | 32.50 | 2026-05-06
OpenUGI | Alignment | 120 | 49.47 | 2026-05-06
OpenUGI | Alignment | 129 | 48.87 | 2026-05-06
OpenUGI | Alignment | 142 | 48.30 | 2026-05-06
ALE-Bench | Coding | 6 | 1367.20 | 2026-05-06
Arena AI Code | Coding | 22 | 1437 | 2026-05-06
Arena AI Code | Coding | 36 | 1389 | 2026-05-06
DeepSWE | Coding | 13 | 5.16 | 2026-05-26
IOI | Coding | 6 | 39.084% | 2026-05-26
LiveCodeBench | Coding | 14 | 85.591% | 2026-05-28
LMArena WebDev Arena | Coding | 22 | 1437.04 | 2026-05-06
SciCode | Coding | 15 | 50.6% | 2026-05-11
SciCode | Coding | 20 | 49.9% | 2026-05-11
SWE Atlas - Codebase QnA | Coding | 9 | 8.20 | 2026-05-06
SWE Atlas - Refactoring | Coding | 11 | 10 | 2026-05-06
SWE Atlas - Test Writing | Coding | 2 | 30.30 | 2026-05-06
SWE-bench Verified | Coding | 16 | 75% | 2026-05-28
Terminal-Bench 2.0 | Coding | 19 | 51.685% | 2026-05-28
Terminal-Bench 2.1 | Coding | 8 | 53.933% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 24 | 20.204% | 2026-05-28
VibeCodingBench | Coding | 15 | 83.44 | 2026-05-06
SecCodeBench | Cybersecurity | 10 | 58.66% | 2026-05-28
OmniDocBench 1.5 | Document Understanding | 10 | 0.12 | 2026-05-06
Arena AI Document | Document AI | 20 | 1421 | 2026-05-06
GSMA Open Telco Leaderboard | Domain | 7 | 70.41 | 2026-05-06
SAGE | Education | 5 | 51.849% | 2026-05-28
From Perception to Action | Embodied AI | 6 | 11.9% | 2026-05-28
AA-Omniscience | Factuality | 6 | 11.57 | 2026-05-11
Vectara HHEM Hallucination Leaderboard | Factuality | 82 | 86.50 | 2026-05-06
CorpFin v2 | Finance | 10 | 66.434% | 2026-05-28
Finance Agent v1.1 | Finance | 31 | 47.598% | 2026-05-04
Finance Agent v2 | Finance | 12 | 42.551% | 2026-05-28
MortgageTax | Finance | 7 | 68.72% | 2026-05-28
Rogo Big Finance Bench | Finance | 7 | 43% rubric / 26% final | 2026-05-28
TaxEval v2 | Finance | 28 | 73.876% | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 4 | 1409.13 Elo / 13 games | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 31 | 978.72 Elo / 89 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 1 | 1566.83 Elo / 27 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 5 | 1376.7 Elo / 92 games | 2026-05-28
MageBench Season 1 | Game | 11 | 1622 rating / 10 games | 2026-05-28
ALL Bench LLM | General Knowledge | 6 | 50.11 | 2026-05-06
BenchLM | General Knowledge | 40 | 65 | 2026-05-06
LMArena Text Arena | Generalization | 13 | 1466.61 | 2026-05-06
LMArena Text Arena | Generalization | 25 | 1448.34 | 2026-05-06
MedCode | Healthcare | 2 | 55.92% | 2026-05-28
MedQA | Healthcare | 11 | 95.808% | 2026-04-16
MedScribe | Healthcare | 50 | 69.917% | 2026-05-28
PlaceboBench | Healthcare | 5 | 44.9275 | 2026-05-27
Artificial Analysis Intelligence Index | Intelligence | 39 | 46.43 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 109 | 35.05 | 2026-05-11
GPQA Diamond | Intelligence | 17 | 87.879% | 2026-05-28
Humanity's Last Exam | Intelligence | 15 | 34.7% | 2026-05-11
Humanity's Last Exam | Intelligence | 101 | 14.1% | 2026-05-11
LiveBench | Intelligence | 19 | 73.05 | 2026-05-05
MMLU Pro | Intelligence | 8 | 88.592% | 2026-05-28
MMLU-Pro | Intelligence | 4 | 89% | 2026-05-11
MMLU-Pro | Intelligence | 6 | 88.2% | 2026-05-11
MMMU Pro | Intelligence | 4 | 87.63% | 2026-05-28
Vals Index | Intelligence | 12 | 49.314% | 2026-05-28
Vals Multimodal Index | Intelligence | 9 | 51.975% | 2026-05-28
CaseLaw v2 | Legal | 32 | 55.842% | 2026-05-04
LegalBench | Legal | 3 | 86.858% | 2026-05-28
MRCR v2 (8-needle) | Long Context | 8 | 0.22 | 2026-05-06
AIME | Math | 9 | 95.625% | 2026-04-16
AIME 2025 | Math | 3 | 97% | 2026-05-11
AIME 2025 | Math | 130 | 55.7% | 2026-05-11
MGSM | Math | 10 | 93.309% | 2026-01-09
ProofBench | Math | 19 | 15% | 2026-05-28
LiveMedBench | Medical | 8 | 0.2167 | 2026-05-27
Medical Chronology LLM Benchmark | Medical | 4 | 0.91 | 2026-05-06
ALL Bench Multimodal | Multimodal | 5 | 51.49 | 2026-05-06
ALL Bench Multimodal | Multimodal | 4 | 16.76 | 2026-05-06
ALL Bench Multimodal | Multimodal | 9 | 8.04 | 2026-05-06
Blueprint-Bench 2 | Multimodal | 11 | 0.534 +/- 0.019 | 2026-05-28
CharXiv-R | Multimodal | 9 | 0.80 | 2026-05-06
Design Arena | Multimodal | 28 | 1249 | 2026-05-06
IDP Leaderboard | Multimodal | 4 | 81.95 | 2026-05-06
LMArena Vision Arena | Multimodal | 9 | 1282.62 | 2026-05-06
LMArena Vision Arena | Multimodal | 16 | 1264.39 | 2026-05-06
VideoMMMU | Multimodal | 2 | 0.87 | 2026-05-06
ARC-AGI v2 | Reasoning | 10 | 0.34 | 2026-05-06
CAIS Text Capabilities Index | Reasoning | 9 | 35.6 | 2026-05-27
Context Arena | Reasoning | 27 | 46.79 | 2026-05-06
Context Arena | Reasoning | 29 | 46.24 | 2026-05-06
Context Arena | Reasoning | 33 | 39.58 | 2026-05-06
Context Arena | Reasoning | 56 | 25.60 | 2026-05-06
Global PIQA | Reasoning | 2 | 0.93 | 2026-05-06
GPQA Diamond | Reasoning | 16 | 89.8% | 2026-05-11
GPQA Diamond | Reasoning | 92 | 81.2% | 2026-05-11
CAIS Risk Index | Safety | 29 | 59.4 | 2026-05-27
InvisibleBench | Safety | 9 | 0.09 | 2026-05-06
CritPt | Science | 20 | 8.6% | 2026-05-11
CritPt | Science | 70 | 1.4% | 2026-05-11
SciPredict | Science | 2 | 22.22 | 2026-05-06
ProgramBench | Software Engineering | 6 | 0% | 2026-05-05
Structured Output Benchmark | Structured Output | 22 | 83.30 | 2026-05-06
CAIS Vision Capabilities Index | Vision | 2 | 64.9 | 2026-05-27
Roboflow Vision Evals - Visual Understanding | Vision | 3 | 79.1% | 2026-05-22