GPT-5.4 Nano

GPT / OpenAI

81 scores
54 benchmarks
$0.2 / $1.25 per 1M tokens (cost in/out)
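
The prices above are quoted per 1M tokens, with input and output billed at different rates. Below is a minimal sketch of how a request's cost could be estimated from them. It assumes the first figure ($0.2) is the input rate and the second ($1.25) is the output rate, as the "in/out" label suggests. The helper name and the token counts are hypothetical, and the estimate ignores any caching or batch discounts, which this listing does not describe.

```python
# Prices taken from the listing above, in USD per 1M tokens.
# Assumption: first figure is the input rate, second is the output rate.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 1.25


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate from token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example with made-up token counts: 200k input tokens, 50k output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.4f}")
```

Because the output rate is about six times the input rate, output-heavy workloads dominate the bill under this estimate.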

Metadata

GPT Closed/API

Aliases: gpt-5.4-nano, gpt-5.4-nano-20260317, openai-gpt-5.4-nano, openai-gpt-5.4-nano-20260317, openai/gpt-5.4-nano, openai/gpt-5.4-nano-20260317

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
APEX-Agents-AA | Agentic | 8 | 24.9% | 2026-05-11
ARC-AGI-1 | Agentic | 64 | 51.50 | 2026-05-05
ARC-AGI-1 | Agentic | 80 | 38.17 | 2026-05-05
ARC-AGI-1 | Agentic | 89 | 33 | 2026-05-05
ARC-AGI-1 | Agentic | 112 | 18.33 | 2026-05-05
ARC-AGI-2 | Agentic | 64 | 5.69 | 2026-05-05
ARC-AGI-2 | Agentic | 78 | 3.61 | 2026-05-05
ARC-AGI-2 | Agentic | 99 | 1.94 | 2026-05-05
ARC-AGI-2 | Agentic | 105 | 1.53 | 2026-05-05
AutoBench | Agentic | 23 | 2.78 | 2026-05-06
Hindsight LLM Memory Leaderboard | Agentic | 14 | 83.90 | 2026-05-06
ITBench-AA | Agentic | 20 | 24.4% | 2026-05-28
OSWorld-Verified | Agentic | 12 | 0.39 | 2026-05-06
PinchBench | Agentic | 42 | 0.79 | 2026-05-06
RuneBench | Agentic | 13 | 2.30 | 2026-05-05
Tau2-Bench Telecom | Agentic | 116 | 76% | 2026-05-11
Tau2-Bench Telecom | Agentic | 173 | 52.6% | 2026-05-11
Tau2-Bench Telecom | Agentic | 217 | 34.8% | 2026-05-11
Terminal-Bench Hard | Agentic | 31 | 42.4% | 2026-05-11
Terminal-Bench Hard | Agentic | 76 | 33.3% | 2026-05-11
Terminal-Bench Hard | Agentic | 125 | 24.2% | 2026-05-11
Toolathlon | Agentic | 14 | 0.35 | 2026-05-06
ALE-Bench | Coding | 27 | 1004.52 | 2026-05-06
IOI | Coding | 22 | 15.25% | 2026-05-26
LiveCodeBench | Coding | 25 | 84.009% | 2026-05-28
MLX Benchmark V2 | Coding | 5 | 75.19 | 2026-05-06
SciCode | Coding | 31 | 46.9% | 2026-05-11
SciCode | Coding | 131 | 38.4% | 2026-05-11
SciCode | Coding | 191 | 35.2% | 2026-05-11
SWE-bench Verified | Coding | 33 | 69.8% | 2026-05-28
Terminal-Bench 2.0 | Coding | 33 | 39.888% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 17 | 26.097% | 2026-05-28
OmniDocBench 1.5 | Document Understanding | 9 | 0.76 | 2026-05-06
SAGE | Education | 34 | 38.081% | 2026-05-28
Vectara HHEM Hallucination Leaderboard | Factuality | 2 | 96.90 | 2026-05-06
CorpFin v2 | Finance | 37 | 61.189% | 2026-05-28
Finance Agent v1.1 | Finance | 30 | 47.801% | 2026-05-04
Finance Agent v2 | Finance | 14 | 38.217% | 2026-05-28
MortgageTax | Finance | 46 | 59.102% | 2026-05-28
TaxEval v2 | Finance | 79 | 67.416% | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 10 | 1282.53 Elo / 18 games | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 32 | 974.38 Elo / 126 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 11 | 1304.64 Elo / 40 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 37 | 795.51 Elo / 130 games | 2026-05-28
InfiniteBM Werewolf | Game | 9 | 902.42 Elo / 6 games | 2026-05-28
MedCode | Healthcare | 25 | 41.029% | 2026-05-28
MedScribe | Healthcare | 31 | 77.09% | 2026-05-28
Artificial Analysis Intelligence Index | Intelligence | 47 | 43.98 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 90 | 38.11 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 201 | 24.36 | 2026-05-11
GPQA Diamond | Intelligence | 49 | 77.526% | 2026-05-28
Humanity's Last Exam | Intelligence | 40 | 26.5% | 2026-05-11
Humanity's Last Exam | Intelligence | 98 | 14.7% | 2026-05-11
Humanity's Last Exam | Intelligence | 391 | 4.2% | 2026-05-11
LiveBench | Intelligence | 26 | 71.31 | 2026-05-05
LiveBench | Intelligence | 47 | 63.64 | 2026-05-05
MMLU Pro | Intelligence | 79 | 77.172% | 2026-05-28
MMMU Pro | Intelligence | 39 | 73.584% | 2026-05-28
Vals Index | Intelligence | 15 | 46.461% | 2026-05-28
Vals Multimodal Index | Intelligence | 11 | 47.484% | 2026-05-28
CaseLaw v2 | Legal | 46 | 51.875% | 2026-05-04
LegalBench | Legal | 72 | 77.92% | 2026-05-28
MRCR v2 (8-needle) | Long Context | 5 | 0.33 | 2026-05-06
AIME | Math | 29 | 88.75% | 2026-04-16
ProofBench | Math | 30 | 5% | 2026-05-28
CAIS Text Capabilities Index | Reasoning | 27 | 17.9 | 2026-05-27
Context Arena | Reasoning | 24 | 48.78 | 2026-05-06
Context Arena | Reasoning | 37 | 36.17 | 2026-05-06
Context Arena | Reasoning | 46 | 29.90 | 2026-05-06
Context Arena | Reasoning | 62 | 21.87 | 2026-05-06
Context Arena | Reasoning | 70 | 12.31 | 2026-05-06
GPQA Diamond | Reasoning | 88 | 81.7% | 2026-05-11
GPQA Diamond | Reasoning | 148 | 76.1% | 2026-05-11
GPQA Diamond | Reasoning | 309 | 55.8% | 2026-05-11
Graphwalks BFS <128k | Reasoning | 5 | 0.73 | 2026-05-06
Graphwalks parents <128k | Reasoning | 9 | 0.51 | 2026-05-06
CAIS Risk Index | Safety | 16 | 48.7 | 2026-05-27
CritPt | Science | 17 | 9.3% | 2026-05-11
CritPt | Science | 34 | 5.1% | 2026-05-11
CritPt | Science | 228 | 0% | 2026-05-11
CAIS Vision Capabilities Index | Vision | 25 | 44.7 | 2026-05-27