GPT-5 Nano | BenchmarkList

Metadata

GPT Closed/API

Aliases: gpt-5-nano, gpt-5-nano-2025-08-07, openai-gpt-5-nano, openai-gpt-5-nano-2025-08-07, openai/gpt-5-nano, openai/gpt-5-nano-2025-08-07

Benchmark	Category	Rank	Score	Sampled
ARC-AGI-1	Agentic	111	20.71	2026-05-05
ARC-AGI-1	Agentic	115	16.67	2026-05-05
ARC-AGI-1	Agentic	139	4.04	2026-05-05
ARC-AGI-1	Agentic	141	1.50	2026-05-05
ARC-AGI-2	Agentic	84	2.61	2026-05-05
ARC-AGI-2	Agentic	118	0.88	2026-05-05
ARC-AGI-2	Agentic	145	0	2026-05-05
ARC-AGI-2	Agentic	146	0	2026-05-05
Berkeley Function-Calling Leaderboard	Agentic	24	51.45%	2026-05-27
Berkeley Function-Calling Leaderboard	Agentic	79	27.55%	2026-05-27
Hindsight LLM Memory Leaderboard	Agentic	13	83.90	2026-05-06
LLM-WikiRace	Agentic	17	24.70	2026-05-06
MCPMark	Agentic	34	0.06	2026-05-06
MCPMark	Agentic	35	0.05	2026-05-06
MCPMark	Agentic	37	0.04	2026-05-06
PinchBench	Agentic	58	0.69	2026-05-06
Tau2-Bench Telecom	Agentic	211	36.5%	2026-05-11
Tau2-Bench Telecom	Agentic	242	30.4%	2026-05-11
Tau2-Bench Telecom	Agentic	274	25.7%	2026-05-11
Terminal-Bench Hard	Agentic	159	17.4%	2026-05-11
Terminal-Bench Hard	Agentic	194	12.1%	2026-05-11
Terminal-Bench Hard	Agentic	235	6.8%	2026-05-11
ALE-Bench	Coding	49	718.67	2026-05-06
LiveCodeBench	Coding	63	70.216%	2026-05-28
MLX Benchmark V2	Coding	11	41.92	2026-05-06
SciCode	Coding	165	36.6%	2026-05-11
SciCode	Coding	208	33.8%	2026-05-11
SciCode	Coding	263	29.1%	2026-05-11
GSMA Open Telco Leaderboard	Domain	31	60.30	2026-05-06
SAGE	Education	48	30.377%	2026-05-28
Vectara HHEM Hallucination Leaderboard	Factuality	58	89.50	2026-05-06
MortgageTax	Finance	58	53.617%	2026-05-28
TaxEval v2	Finance	80	67.376%	2026-05-28
MageBench Season 1	Game	34	1499 rating / 14 games	2026-05-28
Xent Games	Game	12	23.27 overall	2026-05-28
ALL Bench LLM	General Knowledge	39	0	2026-05-06
HELM AIR-Bench	Generalization	6	0.878205	2026-05-28
MedCode	Healthcare	53	30.441%	2026-05-28
MedQA	Healthcare	23	93.258%	2026-04-16
MedScribe	Healthcare	42	72.865%	2026-05-28
AI2D	Intelligence	6	81.9	2026-05-27
Artificial Analysis Intelligence Index	Intelligence	176	26.83	2026-05-11
Artificial Analysis Intelligence Index	Intelligence	183	25.88	2026-05-11
Artificial Analysis Intelligence Index	Intelligence	352	13.84	2026-05-11
GPQA Diamond	Intelligence	79	63.384%	2026-05-28
Humanity's Last Exam	Intelligence	183	8.2%	2026-05-11
Humanity's Last Exam	Intelligence	197	7.6%	2026-05-11
Humanity's Last Exam	Intelligence	402	4.1%	2026-05-11
MMBench	Intelligence	7	80.3	2026-05-27
MMLU-Pro	Intelligence	80	76.067%	2026-05-28
MMLU-Pro	Intelligence	142	78%	2026-05-11
MMLU-Pro	Intelligence	154	77.2%	2026-05-11
MMLU-Pro	Intelligence	283	55.6%	2026-05-11
MMMU-Pro	Intelligence	49	70.942%	2026-05-28
RealWorldQA	Intelligence	6	71.8	2026-05-27
Seneca-TRBench	Language	2	92.90	2026-05-06
CaseLaw v2	Legal	44	52.626%	2026-05-04
LegalBench	Legal	109	50.129%	2026-05-28
LEXam	Legal	31	27.25% open / 47.11% MCQ	2026-05-28
AIME	Math	45	81.181%	2026-04-16
AIME 2025	Math	53	83.7%	2026-05-11
AIME 2025	Math	72	78.3%	2026-05-11
AIME 2025	Math	190	27.3%	2026-05-11
MATH 500	Math	16	93.8%	2026-01-09
MGSM	Math	48	89.309%	2026-01-09
ProofBench	Math	22	12%	2026-05-28
HMMT 2025	Mathematics	25	0.76	2026-05-06
ALL Bench Multimodal	Multimodal	41	0	2026-05-06
ALL Bench Multimodal	Multimodal	5	50.88	2026-05-06
Design Arena	Multimodal	84	1143	2026-05-06
IDP Leaderboard	Multimodal	22	54.81	2026-05-06
Artificial Analysis Openness Index	Openness	225	5.56	2026-05-11
Artificial Analysis Openness Index	Openness	226	5.56	2026-05-11
Artificial Analysis Openness Index	Openness	227	5.56	2026-05-11
CAIS Text Capabilities Index	Reasoning	39	4.8	2026-05-27
GPQA Diamond	Reasoning	225	67.6%	2026-05-11
GPQA Diamond	Reasoning	232	67%	2026-05-11
GPQA Diamond	Reasoning	379	42.8%	2026-05-11
CAIS Risk Index	Safety	19	51.9	2026-05-27
ChemBench	Science	8	0.63	2026-05-06
CritPt	Science	222	0%	2026-05-11
CritPt	Science	223	0%	2026-05-11
CritPt	Science	224	0%	2026-05-11
CAIS Vision Capabilities Index	Vision	26	43.4	2026-05-27
