GPT-4.1 Nano

GPT / OpenAI

54 scores
53 benchmarks
$0.1 / $0.4 per 1M tokens (cost in/out)
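
As a rough illustration of what the listed price means in practice, the sketch below converts token counts into a dollar estimate. It assumes the two figures are flat per-million-token rates for input and output respectively, with no caching discounts, batch pricing, or tiering; the function name `estimate_cost_usd` is hypothetical and not part of any API.

```python
from decimal import Decimal

# Rates copied from the listing above; assumed to be flat USD prices per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.1")
OUTPUT_USD_PER_MILLION = Decimal("0.4")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of a workload from its input and output token counts."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / million
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / million
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500k output tokens.
    print(estimate_cost_usd(2_000_000, 500_000))
```

`Decimal` is used instead of floats so that small per-token prices do not pick up binary rounding noise when summed over many requests.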

Metadata

GPT Closed/API

Aliases: gpt-4.1-nano, gpt-4.1-nano-2025-04-14, openai-gpt-4.1-nano, openai-gpt-4.1-nano-2025-04-14, openai/gpt-4.1-nano, openai/gpt-4.1-nano-2025-04-14

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ARC-AGI-1 | Agentic | 143 | 0 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 135 | 0 | 2026-05-05 |
| Berkeley Function-Calling Leaderboard | Agentic | 58 | 33.05% | 2026-05-27 |
| Berkeley Function-Calling Leaderboard | Agentic | 90 | 24.88% | 2026-05-27 |
| Galileo Agent Leaderboard | Agentic | 14 | 0.38 | 2026-05-06 |
| Hindsight LLM Memory Leaderboard | Agentic | 2 | 87.20 | 2026-05-06 |
| MCPMark | Agentic | 39 | 0 | 2026-05-06 |
| RealDataAgentBench | Agentic | 12 | 0.62 | 2026-04-28 |
| Tau2-Bench Telecom | Agentic | 338 | 17.3% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 286 | 3.8% | 2026-05-11 |
| TextClass Benchmark | Classification | 61 | 1533.06 | 2026-05-06 |
| BigCodeBench-Hard | Coding | 22 | 28.40 | 2026-05-05 |
| LiveCodeBench | Coding | 96 | 42.718% | 2026-05-28 |
| SciCode | Coding | 313 | 25.9% | 2026-05-11 |
| GSMA Open Telco Leaderboard | Domain | 50 | 48.28 | 2026-05-06 |
| CorpFin v2 | Finance | 97 | 42.075% | 2026-05-28 |
| MortgageTax | Finance | 60 | 52.822% | 2026-05-28 |
| TaxEval v2 | Finance | 98 | 60.752% | 2026-05-28 |
| BenchLM | General Knowledge | 95 | 27 | 2026-05-06 |
| Arena-Hard | Generalization | 25 | 13.7% | 2026-05-27 |
| HELM AIR-Bench | Generalization | 55 | 0.615297 | 2026-05-28 |
| HELM Safety | Generalization | 20 | 0.937650 | 2026-05-28 |
| MedQA | Healthcare | 83 | 68.225% | 2026-04-16 |
| Multi-IF | Instruction Following | 20 | 0.57 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 366 | 13.04 | 2026-05-11 |
| GPQA Diamond | Intelligence | 93 | 50.758% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 425 | 3.9% | 2026-05-11 |
| MMLU Pro | Intelligence | 102 | 63.479% | 2026-05-28 |
| MMLU-Pro | Intelligence | 249 | 65.7% | 2026-05-11 |
| MMMU Pro | Intelligence | 69 | 55.055% | 2026-05-28 |
| SimpleQA | Intelligence | 23 | 7.6% | 2026-05-27 |
| HindiGen v1 | Language | 20 | 56.89 | 2026-05-06 |
| LegalBench | Legal | 103 | 61.056% | 2026-05-28 |
| LEXam | Legal | 20 | 43.68% open / 39.22% MCQ | 2026-05-28 |
| Graphwalks BFS >128k | Long Context | 7 | 0.03 | 2026-05-06 |
| Graphwalks parents >128k | Long Context | 6 | 0.06 | 2026-05-06 |
| OpenAI-MRCR: 2 needle 128k | Long Context | 7 | 0.37 | 2026-05-06 |
| OpenAI-MRCR: 2 needle 1M | Long Context | 5 | 0.12 | 2026-05-06 |
| AIME | Math | 76 | 26.458% | 2026-04-16 |
| AIME 2025 | Math | 201 | 24% | 2026-05-11 |
| MATH 500 | Math | 39 | 80.2% | 2026-01-09 |
| MGSM | Math | 74 | 69.273% | 2026-01-09 |
| LanguageBench | Multilingual | 15 | 0.52 | 2026-05-06 |
| CharXiv-D | Multimodal | 13 | 0.74 | 2026-05-06 |
| CharXiv-R | Multimodal | 33 | 0.41 | 2026-05-06 |
| Design Arena | Multimodal | 111 | 1021 | 2026-05-06 |
| Math-VR | Multimodal | 26 | 9.1 | 2026-05-27 |
| Visual-Language Understanding | Multimodal | 57 | 26.55 | 2026-05-06 |
| GPQA Diamond | Reasoning | 339 | 51.2% | 2026-05-11 |
| Graphwalks BFS <128k | Reasoning | 11 | 0.25 | 2026-05-06 |
| Graphwalks parents <128k | Reasoning | 11 | 0.09 | 2026-05-06 |
| CritPt | Science | 215 | 0% | 2026-05-11 |
| ComplexFuncBench | Tool Use | 6 | 0.06 | 2026-05-06 |
| COLLIE | Writing | 9 | 0.42 | 2026-05-06 |