Gemini 3

Gemini / Google

133 scores
116 benchmarks

Metadata

Gemini Closed/API

Aliases: gemini-3, google-gemini-3, google/gemini-3

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
ADBench | Agentic | 1 | 83 | 2026-05-06
APEX-Agents | Agentic | 17 | 34.10 | 2026-05-06
ARC-AGI-1 | Agentic | 38 | 75 | 2026-05-05
ARC-AGI-2 | Agentic | 35 | 31.11 | 2026-05-05
Berkeley Function-Calling Leaderboard | Agentic | 3 | 72.51% | 2026-05-27
Berkeley Function-Calling Leaderboard | Agentic | 7 | 68.14% | 2026-05-27
EnterpriseOps-Gym | Agentic | 9 | 27.4% | 2026-05-05
Gert Labs Rankings | Agentic | 14 | 0.56 | 2026-05-11
LLM-WikiRace | Agentic | 2 | 66 | 2026-05-06
MCP Atlas | Agentic | 5 | 70.30 | 2026-05-06
MCPMark | Agentic | 2 | 0.54 | 2026-05-06
MCPMark | Agentic | 5 | 0.51 | 2026-05-06
MultiChallenge | Agentic | 3 | 65.67 | 2026-05-06
PinchBench | Agentic | 56 | 0.71 | 2026-05-06
Poker Agent | Agentic | 6 | 1078.905% | 2025-12-23
RuneBench | Agentic | 10 | 3.80 | 2026-05-05
t2-bench | Agentic | 7 | 0.85 | 2026-05-06
Tau2-Bench Telecom | Agentic | 70 | 87.1% | 2026-05-11
Tau2-Bench Telecom | Agentic | 143 | 68.1% | 2026-05-11
Terminal-Bench Hard | Agentic | 33 | 41.7% | 2026-05-11
Terminal-Bench Hard | Agentic | 70 | 34.1% | 2026-05-11
Vending-Bench 2 | Agentic | 10 | 5478.16 | 2026-05-28
VitaBench | Agentic | 2 | 31.50 | 2026-05-06
VitaBench | Agentic | 15 | 30 | 2026-05-06
OpenUGI | Alignment | 321 | 41.71 | 2026-05-06
OpenUGI | Alignment | 463 | 37.82 | 2026-05-06
ALE-Bench | Coding | 17 | 1176.75 | 2026-05-06
ALE-Bench | Coding | 29 | 988.23 | 2026-05-06
Arena AI Code | Coding | 20 | 1438 | 2026-05-06
HoudiniVexBench | Coding | 2 | 0.50 | 2026-05-06
IOI | Coding | 7 | 38.834% | 2026-05-26
LiveCodeBench | Coding | 11 | 86.407% | 2026-05-28
LMArena WebDev Arena | Coding | 20 | 1438.22 | 2026-05-06
SciCode | Coding | 3 | 56.1% | 2026-05-11
SciCode | Coding | 21 | 49.9% | 2026-05-11
SWE-bench Verified | Coding | 12 | 76.4% | 2026-05-28
Terminal-Bench 2.0 | Coding | 15 | 55.056% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 31 | 14.3% | 2026-05-28
VibeCodingBench | Coding | 12 | 85.80 | 2026-05-06
SecCodeBench | Cybersecurity | 4 | 62.42% | 2026-05-28
OmniDocBench 1.5 | Document Understanding | 11 | 0.12 | 2026-05-06
Arena AI Document | Document AI | 15 | 1442 | 2026-05-06
GSMA Open Telco Leaderboard | Domain | 3 | 74.65 | 2026-05-06
SAGE | Education | 15 | 47.615% | 2026-05-28
TutorBench | Education | 2 | 53.67 | 2026-05-06
From Perception to Action | Embodied AI | 2 | 19.3% | 2026-05-28
Vectara HHEM Hallucination Leaderboard | Factuality | 83 | 86.40 | 2026-05-06
CorpFin v2 | Finance | 25 | 63.675% | 2026-05-28
Finance Agent v1.1 | Finance | 14 | 55.154% | 2026-05-04
MortgageTax | Finance | 4 | 69.078% | 2026-05-28
PRBench Finance | Finance | 16 | 39.18 | 2026-05-06
QuantSightBench | Finance | 9 | 0.6543 coverage | 2026-05-28
TaxEval v2 | Finance | 39 | 72.568% | 2026-05-28
MageBench Season 1 | Game | 3 | 1722 rating / 11 games | 2026-05-28
ALL Bench LLM | General Knowledge | 27 | 25.55 | 2026-05-06
BenchLM | General Knowledge | 18 | 81 | 2026-05-06
HELM AIR-Bench | Generalization | 33 | 0.732086 | 2026-05-28
LMArena Text Arena | Generalization | 5 | 1479.29 | 2026-05-06
WeirdML | Generalization | 2 | 69.93 | 2026-05-06
GeoRC | Geospatial | 8 | 40.98 | 2026-05-27
MedCode | Healthcare | 7 | 52.198% | 2026-05-28
MedQA | Healthcare | 8 | 96.033% | 2026-04-16
MedScribe | Healthcare | 47 | 72.036% | 2026-05-28
Omi SOAP Note Safety Benchmark | Healthcare | 2 | 4.70 | 2026-04-21
PlaceboBench | Healthcare | 1 | 73.913 | 2026-05-27
HUMAINE | Human Preference | 38 | 3.34 | 2026-05-06
AIIQ Composite IQ | Intelligence | 6 | 126 | 2026-05-12
Artificial Analysis Intelligence Index | Intelligence | 31 | 48.39 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 68 | 41.3 | 2026-05-11
GPQA Diamond | Intelligence | 5 | 91.666% | 2026-05-28
Humanity's Last Exam | Intelligence | 9 | 37.2% | 2026-05-11
Humanity's Last Exam | Intelligence | 34 | 27.6% | 2026-05-11
MathVision | Intelligence | 8 | 86.60 | 2026-05-06
MMLU Pro | Intelligence | 2 | 90.102% | 2026-05-28
MMLU-Pro | Intelligence | 1 | 89.8% | 2026-05-11
MMLU-Pro | Intelligence | 3 | 89.5% | 2026-05-11
MMMU Pro | Intelligence | 5 | 87.514% | 2026-05-28
OCRBench v2 | Intelligence | 3 | 63.40 | 2026-05-06
OCRBench v2 | Intelligence | 3 | 63.80 | 2026-05-06
AraGen v3 | Language | 16 | 64.15 | 2026-05-06
CaseLaw v2 | Legal | 42 | 53.055% | 2026-05-04
LegalBench | Legal | 2 | 87.025% | 2026-05-28
LEXam | Legal | 12 | 55.38% open questions | 2026-05-28
Professional Reasoning Bench - Legal | Legal | 14 | 40.60 | 2026-05-06
MRCR v2 (8-needle) | Long Context | 6 | 0.26 | 2026-05-06
AIME | Math | 4 | 96.68% | 2026-04-16
AIME 2025 | Math | 7 | 95.7% | 2026-05-11
AIME 2025 | Math | 43 | 86.7% | 2026-05-11
MATH 500 | Math | 1 | 96.4% | 2026-01-09
MGSM | Math | 7 | 93.927% | 2026-01-09
ProofBench | Math | 14 | 20% | 2026-05-28
FrontierMath 2025-02-28 Private | Mathematics | 2 | 37.60 | 2026-05-06
FrontierMath Tier 4 2025-07-01 Private | Mathematics | 2 | 18.75 | 2026-05-06
MathArena Apex | Mathematics | 3 | 0.23 | 2026-05-06
OTIS Mock AIME 2024-2025 | Mathematics | 2 | 92.78 | 2026-05-06
LiveMedBench | Medical | 9 | 0.1829 | 2026-05-27
Medmarks | Medical | 8 | 0.4712656820900838 | 2026-05-27
Medmarks | Medical | 1 | 0.6627770031943667 | 2026-05-27
MedSafe-Dx | Medical | 11 | 62.4 | 2026-05-27
AfroBench-Lite | Multilingual | 2 | 76.01 | 2026-05-06
ALL Bench Multimodal | Multimodal | 22 | 29.76 | 2026-05-06
ALL Bench Multimodal | Multimodal | 5 | 16.75 | 2026-05-06
CharXiv-R | Multimodal | 7 | 0.81 | 2026-05-06
IDP Leaderboard | Multimodal | 3 | 82.77 | 2026-05-06
JMMMU-Pro | Multimodal | 1 | 87.05 | 2026-05-06
LMArena Vision Arena | Multimodal | 6 | 1304.72 | 2026-05-06
MMLongBench-Doc | Multimodal | 3 | 60.50 | 2026-05-06
MMSI-Bench | Multimodal | 2 | 49.2% | 2026-05-28
VideoMMMU | Multimodal | 1 | 0.88 | 2026-05-06
Visual-Language Understanding | Multimodal | 3 | 51.49 | 2026-05-06
VPCT | Multimodal | 1 | 91 | 2026-05-06
VTB | Multimodal | 3 | 26.85 | 2026-05-06
Artificial Analysis Openness Index | Openness | 216 | 5.56 | 2026-05-11
ARC-AGI v2 | Reasoning | 11 | 0.31 | 2026-05-06
CAIS Text Capabilities Index | Reasoning | 7 | 38.4 | 2026-05-27
EnigmaEval | Reasoning | 2 | 18.24 | 2026-05-06
FINAL Bench Metacognitive | Reasoning | 2 | 77.08 | 2026-05-06
Global PIQA | Reasoning | 1 | 0.93 | 2026-05-06
GPQA Diamond | Reasoning | 11 | 90.8% | 2026-05-11
GPQA Diamond | Reasoning | 22 | 88.7% | 2026-05-11
Humanity's Last Exam (Text Only) | Reasoning | 3 | 37.72 | 2026-05-06
MultiNRC | Reasoning | 2 | 58.96 | 2026-05-06
SimpleBench | Reasoning | 1 | 76.40 | 2026-05-06
CAIS Risk Index | Safety | 30 | 59.9 | 2026-05-27
CritPt | Science | 18 | 9.1% | 2026-05-11
CritPt | Science | 189 | 0% | 2026-05-11
GSO-Bench | Science | 3 | 18.60 | 2026-05-06
SciPredict | Science | 1 | 25.27 | 2026-05-06
IDE-Bench | Software Engineering | 8 | 55 | 2026-05-27
AudioMC | Speech | 1 | 54.65 | 2026-05-07
AudioMC - Text Output | Speech | 1 | 54.65 | 2026-05-06
CAIS Vision Capabilities Index | Vision | 7 | 55.5 | 2026-05-27
K-MetBench | Weather | 1 | 93.7% accuracy | 2026-05-28