Gemma 3 27B | BenchmarkList

Metadata

Gemma Closed/API

Aliases: gemma-3-27b-it, gemma-3-27b-it:free, google-gemma-3-27b-it, google/gemma-3-27b-it, google/gemma-3-27b-it:free

Benchmark	Category	Rank	Score	Sampled
Berkeley Function-Calling Leaderboard	Agentic	69	29.47%	2026-05-27
Clembench Multimodal v1.6.5	Agentic	6	61.39	2026-05-06
LLM-WikiRace	Agentic	16	30	2026-05-06
Tau2-Bench Telecom	Agentic	372	10.5%	2026-05-11
Terminal-Bench Hard	Agentic	284	3.8%	2026-05-11
OpenUGI	Alignment	1068	20.43	2026-05-06
Natural2Code	Coding	3	0.84	2026-05-06
SciCode	Coding	358	21.2%	2026-05-11
NeoEvalPlusN	Creative	134	10	2026-05-06
GSMA Open Telco Leaderboard	Domain	45	50.43	2026-05-06
kluster.ai LLM Hallucination Detection Leaderboard	Factuality	6	97.91	2026-05-06
Vectara HHEM Hallucination Leaderboard	Factuality	34	92.60	2026-05-06
BenchLM	General Knowledge	110	17	2026-05-06
Global-MMLU-Lite	General Knowledge	6	0.75	2026-05-06
Arena-Hard	Generalization	23	15.0%	2026-05-27
NeedleBench	Generalization	2	80.38%	2026-05-27
HealthBench Hard	Healthcare	2	0.59	2026-05-27
HUMAINE	Human Preference	19	3.57	2026-05-06
Artificial Analysis Intelligence Index	Intelligence	419	10.31	2026-05-11
FACTS Grounding	Intelligence	12	0.38	2026-05-06
Humanity's Last Exam	Intelligence	334	4.7%	2026-05-11
MathVision	Intelligence	66	46	2026-05-06
MMLU-Pro	Intelligence	244	66.9%	2026-05-11
HindiGen v1	Language	24	50.32	2026-05-06
Open Arabic LLM Leaderboard	Language	20	71.40	2026-05-06
PIQA	Language	7	83.70	2026-05-06
J1-ENVS	Legal	4	55.73	2026-05-26
AIME 2025	Mathematics	209	20.7%	2026-05-11
HiddenMath	Mathematics	2	0.60	2026-05-06
OTIS Mock AIME 2024-2025	Mathematics	25	19.72	2026-05-06
BRIDGE Medical Leaderboard	Medical	20	49.45	2026-05-27
BRIDGE Medical Leaderboard	Medical	102	39.9	2026-05-27
BRIDGE Medical Leaderboard	Medical	140	37.55	2026-05-27
MEDIC Benchmark	Medical	66	59.16 (average normalized public-table score)	2026-05-27
Medmarks	Medical	51	0.46	2026-05-27
AfroBench-Lite	Multilingual	12	51.01	2026-05-06
LanguageBench	Multilingual	20	0.50	2026-05-06
ChartQA	Multimodal	21	0.78	2026-05-06
InfoVQA	Multimodal	6	0.71	2026-05-06
Artificial Analysis Openness Index	Openness	51	50	2026-05-11
BIG-Bench Extra Hard	Reasoning	5	0.19	2026-05-06
ECLeKTic	Reasoning	2	0.17	2026-05-06
GPQA Diamond	Reasoning	378	42.8%	2026-05-11
CritPt	Science	193	0%	2026-05-11
Defects4J	Software Engineering	34	0.184	2026-05-27
RepairBench	Software Engineering	34	0.173	2026-05-27
SWE-bench Pro	Software Engineering	10	11.38	2026-05-06
Structured Output Benchmark	Structured Output	17	84.70	2026-05-06
WMT24++	Translation	10	0.53	2026-05-06
Lech Mazur Writing	Writing	15	7.99	2026-05-06
