Gemini 2.0 Flash
Gemini / Google
66 scores
59 benchmarks
$0.1 / $0.4 per 1M tokens (cost in/out)
Metadata
Gemini Closed/API
Aliases: gemini-2.0-flash-001, google-gemini-2.0-flash-001, google/gemini-2.0-flash-001
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ARC-AGI-2 | Agentic | 106 | 1.30 | 2026-05-05 |
| LLM-WikiRace | Agentic | 14 | 41.30 | 2026-05-06 |
| MultiChallenge | Agentic | 29 | 36.35 | 2026-05-06 |
| Tau2-Bench Telecom | Agentic | 244 | 29.5% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 283 | 3.8% | 2026-05-11 |
| TextClass Benchmark | Classification | 17 | 1701.95 | 2026-05-06 |
| LiveCodeBench | Coding | 93 | 43.608% | 2026-05-28 |
| Natural2Code | Coding | 1 | 0.93 | 2026-05-06 |
| SciCode | Coding | 206 | 34% | 2026-05-11 |
| SciCode | Coding | 213 | 33.3% | 2026-05-11 |
| GSMA Open Telco Leaderboard | Domain | 27 | 61.40 | 2026-05-06 |
| K-12EduBench | Education | 15 | 64.05 | 2026-05-27 |
| RoboBench | Embodied | 4 | 45.04 | 2026-05-27 |
| BizFinBench | Finance | 6 | 69.75 | 2026-05-27 |
| CorpFin v2 | Finance | 107 | 33.722% | 2026-05-28 |
| MortgageTax | Finance | 45 | 59.658% | 2026-05-28 |
| TaxEval v2 | Finance | 87 | 65.25% | 2026-05-28 |
| HELM AIR-Bench | Generalization | 44 | 0.662188 | 2026-05-28 |
| HELM Safety | Generalization | 30 | 0.909540 | 2026-05-28 |
| LongBench v2 | Generalization | 14 | 51.1% | 2026-05-27 |
| HELM MedQA | Healthcare | 10 | 0.848907 | 2026-05-28 |
| MedAgentBench | Healthcare | 9 | 38.33% | 2026-05-27 |
| MedQA | Healthcare | 70 | 81.467% | 2026-04-16 |
| HUMAINE | Human Preference | 22 | 3.55 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 267 | 18.51 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 288 | 16.77 | 2026-05-11 |
| GPQA Diamond | Intelligence | 76 | 65.152% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 269 | 5.3% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 333 | 4.7% | 2026-05-11 |
| MathVista | Intelligence | 11 | 73.10 | 2026-05-06 |
| MMLU Pro | Intelligence | 77 | 77.376% | 2026-05-28 |
| MMLU-Pro | Intelligence | 140 | 78.2% | 2026-05-11 |
| MMLU-Pro | Intelligence | 143 | 77.9% | 2026-05-11 |
| MMMU Pro | Intelligence | 52 | 69.786% | 2026-05-28 |
| SuperGPQA | Intelligence | 10 | 47.73 | 2026-05-06 |
| LegalBench | Legal | 70 | 78.364% | 2026-05-28 |
| Fiction.LiveBench | Long Context | 19 | 37.50 | 2026-05-06 |
| AIME | Math | 73 | 29.792% | 2026-04-16 |
| AIME 2025 | Math | 206 | 21.7% | 2026-05-11 |
| IneqMath | Math | 36 | 3 | 2026-05-06 |
| MATH 500 | Math | 31 | 88% | 2026-01-09 |
| MGSM | Math | 52 | 89.018% | 2026-01-09 |
| HiddenMath | Mathematics | 1 | 0.63 | 2026-05-06 |
| OTIS Mock AIME 2024-2025 | Mathematics | 17 | 57.78 | 2026-05-06 |
| BRIDGE Medical Leaderboard | Medical | 4 | 53.33 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 66 | 43.03 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 76 | 41.98 | 2026-05-27 |
| LiveMedBench | Medical | 37 | 0.0271 | 2026-05-27 |
| MedHELM | Medical | 6 | 0.41964285714285715 | 2026-05-27 |
| MedSafe-Dx | Medical | 10 | 80 | 2026-05-27 |
| AfroBench-Lite | Multilingual | 7 | 66.43 | 2026-05-06 |
| LanguageBench | Multilingual | 1 | 0.69 | 2026-05-06 |
| Math-VR | Multimodal | 20 | 20.6 | 2026-05-27 |
| MMAU | Multimodal | 10 | 67.03 | 2026-05-06 |
| Vibe-Eval | Multimodal | 4 | 0.56 | 2026-05-06 |
| Visual-Language Understanding | Multimodal | 40 | 39.85 | 2026-05-06 |
| EnigmaEval | Reasoning | 39 | 0.63 | 2026-05-06 |
| GPQA Diamond | Reasoning | 255 | 63.6% | 2026-05-11 |
| GPQA Diamond | Reasoning | 262 | 62.3% | 2026-05-11 |
| SimpleBench | Reasoning | 15 | 30.70 | 2026-05-06 |
| CritPt | Science | 183 | 0% | 2026-05-11 |
| Defects4J | Software Engineering | 17 | 0.33 | 2026-05-27 |
| RepairBench | Software Engineering | 17 | 0.304 | 2026-05-27 |
| CoVoST2 | Speech | 1 | 0.39 | 2026-05-06 |
| StructEval | Structured Output | 7 | 62.55% | 2026-05-28 |
| Lech Mazur Writing | Writing | 20 | 7.38 | 2026-05-06 |