GDPval

Real-world, economically valuable knowledge work tasks across 44 occupations and 9 U.S. GDP sectors.

Rows: 11
Primary metric: wins_or_ties
Sampled: 2026-04-23

Metadata

Metrics

Wins/Ties vs Human
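
As a rough illustration of how a wins-or-ties percentage like the ones reported here could be derived, the sketch below assumes each task produces a single verdict of "win", "tie", or "loss" for the model against a human baseline. The function name, the verdict labels, and the sample verdicts are hypothetical; this page does not document the actual grading pipeline.

```python
from collections import Counter


def wins_or_ties_rate(outcomes):
    """Return the percentage of comparisons the model won or tied.

    `outcomes` is a sequence of verdict strings. The labels "win",
    "tie", and "loss" are assumed for illustration only.
    """
    if not outcomes:
        raise ValueError("no comparisons to score")
    counts = Counter(outcomes)
    # Ties count in the model's favour, matching the metric name.
    return 100.0 * (counts["win"] + counts["tie"]) / len(outcomes)


# Hypothetical verdicts, not data from this leaderboard.
example = ["win", "tie", "loss", "win"]
rate = wins_or_ties_rate(example)
```

Under this reading, a score of 80% would mean the model's output was judged at least as good as the human's on four out of five tasks.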

Showing 2 latest source slices.

Latest Results

Provider-published launch-post benchmark scores parsed from OpenAI's evaluation tables. Rows are marked self-reported and should be interpreted as source claims unless independently reproduced. OpenAI notes GPT evals were run with reasoning effort set to xhigh in a research environment.

Rank  Subject                 Wins/Ties vs Human  Model Match                                             Provenance   Sampled
1     GPT-5.5                 84.9%               GPT-5.5 (openai-gpt-5.5)                                Launch post  2026-04-23
2     GPT-5.4                 83%                 GPT-5.4 (openai-gpt-5.4)                                Launch post  2026-04-23
3     GPT-5.5 Pro             82.3%               GPT-5.5 Pro (openai-gpt-5.5-pro)                        Launch post  2026-04-23
4     GPT-5.4 Pro             82%                 GPT-5.4 Pro (openai-gpt-5.4-pro)                        Launch post  2026-04-23
5     Claude Opus 4.7         80.3%               Claude Opus 4.7 (anthropic-claude-opus-4.7)             Launch post  2026-04-23
6     Gemini 3.1 Pro Preview  67.3%               Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview)  Launch post  2026-04-23

Rank  Subject                 Wins/Ties vs Human  Model Match                                             Provenance   Sampled
1     Claude Opus 4.1         47.6%               Claude Opus 4.1 (anthropic-claude-opus-4.1)             Imported     2025-09-25
2     GPT-5                   39.0%               GPT-5 (openai-gpt-5)                                    Imported     2025-09-25
3     o3                      35.2%               o3 (openai-o3)                                          Imported     2025-09-25
4     o4-mini                 29.1%               o4 Mini (openai-o4-mini)                                Imported     2025-09-25
5     GPT-4o                  12.5%               GPT-4o (openai-gpt-4o)                                  Imported     2025-09-25