Humanity's Last Exam (Text Only)
The Humanity's Last Exam text-only leaderboard evaluates frontier LLMs on text-based expert questions, excluding multimodal content.
Rows: 60
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Confidence Interval Upper, Max Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | gemini-3.1-pro-preview (thinking high) | 47.31 | Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview | Imported | 2026-05-06 |
| 1 | gpt-5.4-pro-2026-03-05 | 45.32 | GPT-5.4 Pro openai-gpt-5.4-pro | Imported | 2026-05-06 |
| 3 | Muse Spark | 40.92 | — | Imported | 2026-05-06 |
| 3 | gemini-3-pro-preview | 37.72 | Gemini 3 google-gemini-3 | Imported | 2026-05-06 |
| 4 | gpt-5.4-2026-03-05 (xhigh thinking) | 36.47 | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-06 |
| 4 | claude-opus-4-6-thinking-max | 36.24 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 5 | gpt-5-pro-2025-10-06 | 33.32 | GPT-5 Pro openai-gpt-5-pro | Imported | 2026-05-06 |
| 8 | gpt-5.2-2025-12-11 | 28.50 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-06 |
| 8 | gpt-5-2025-08-07 | 26.32 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 8 | claude-opus-4-5-20251101-thinking | 26.32 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 9 | gpt-5.1-thinking | 24.65 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-06 |
| 11 | gemini-2.5-pro-preview-06-05 | 22.06 | Gemini 2.5 Pro Preview 06-05 google-gemini-2.5-pro-preview | Imported | 2026-05-06 |
| 12 | o3 (high) (April 2025) | 20.57 | o3 openai-o3 | Imported | 2026-05-06 |
| 12 | o3 (medium) (April 2025) | 19.78 | o3 openai-o3 | Imported | 2026-05-06 |
| 12 | gpt-5-mini-2025-08-07 | 19.74 | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-06 |
| 12 | claude-opus-4-6 (Non-Thinking) | 19.37 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 12 | o4-mini (high) (April 2025) | 18.90 | o4 Mini openai-o4-mini | Imported | 2026-05-06 |
| 13 | Gemini 2.5 Pro Experimental (March 2025) | 18.38 | — | Imported | 2026-05-06 |
| 13 | Gemini 2.5 Pro Preview (May 06 2025) | 18.38 | Gemini 2.5 Pro Preview 06-05 google-gemini-2.5-pro-preview | Imported | 2026-05-06 |
| 18 | gpt-oss-120b | 15.48 | gpt-oss-120b openai-gpt-oss-120b | Imported | 2026-05-06 |
| 18 | Qwen3-235B-A22B-Thinking-2507 | 15.43 | Qwen3 235B A22B Thinking 2507 qwen-qwen3-235b-a22b-thinking-2507 | Imported | 2026-05-06 |
| 20 | o4-mini (medium) (April 2025) | 14.53 | o4 Mini openai-o4-mini | Imported | 2026-05-06 |
| 20 | claude-sonnet-4-5-20250929-thinking | 14.09 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 20 | DeepSeek-R1-0528 | 14.04 | R1 0528 deepseek-deepseek-r1-0528 | Imported | 2026-05-06 |
| 20 | claude-opus-4-5-20251101 | 13.90 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 20 | o3 mini (high) | 13.37 | o3 Mini High openai-o3-mini-high | Imported | 2026-05-06 |
| 20 | DeepSeek-V3.1 | 12.88 | DeepSeek V3.1 Terminus deepseek-deepseek-v3.1-terminus | Imported | 2026-05-06 |
| 20 | Gemini 2.5 Flash (April 2025) | 12.58 | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-06 |
| 22 | Qwen3-235B-A22B | 11.75 | Qwen3 235B A22B qwen-qwen3-235b-a22b | Imported | 2026-05-06 |
| 24 | claude-opus-4-1-20250805-thinking | 11.26 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 26 | Claude Opus 4 (Thinking) | 10.80 | Claude Opus 4 anthropic-claude-opus-4 | Imported | 2026-05-06 |
| 26 | Gemini 2.5 Flash Preview (May 2025) | 10.72 | — | Imported | 2026-05-06 |
| 27 | o3 mini (medium) | 10.31 | o3-mini openai-o3-mini | Imported | 2026-05-06 |
| 29 | gpt-oss-20b | 9.73 | gpt-oss-20b openai-gpt-oss-20b | Imported | 2026-05-06 |
| 29 | glm-4p5 | 9.64 | GLM 4.5 z-ai-glm-4.5 | Imported | 2026-05-06 |
| 29 | glm-4p5-air | 9.41 | GLM 4.5 Air z-ai-glm-4.5-air | Imported | 2026-05-06 |
| 31 | DeepSeek R1 | 8.54 | R1 deepseek-r1 | Imported | 2026-05-06 |
| 33 | gemini-3.1-flash-lite-preview | 8.02 | Gemini 3.1 Flash Lite Preview google-gemini-3.1-flash-lite-preview | Imported | 2026-05-06 |
| 33 | Claude 3.7 Sonnet (Thinking) | 7.89 | Claude 3.7 Sonnet (thinking) anthropic-claude-3.7-sonnet-thinking | Imported | 2026-05-06 |
| 34 | o1 (December 2024) | 7.75 | o1 openai-o1 | Imported | 2026-05-06 |
| 34 | o1 Pro | 7.71 | o1-pro openai-o1-pro | Imported | 2026-05-06 |
| 34 | claude-sonnet-4-5-20250929 | 7.65 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 34 | Claude Sonnet 4 (Thinking) | 7.60 | — | Imported | 2026-05-06 |
| 35 | claude-opus-4-1-20250805 | 7.37 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 37 | Gemini 2.0 Flash Thinking (January 2025) | 6.55 | — | Imported | 2026-05-06 |
| 37 | gpt-5.1-instant | 6.49 | GPT-5.1 Chat openai-gpt-5.1-chat | Imported | 2026-05-06 |
| 38 | Claude Opus 4 | 6.26 | — | Imported | 2026-05-06 |
| 39 | GPT 4.5 Preview | 5.80 | GPT-4.5 openai-gpt-4.5-preview | Imported | 2026-05-06 |
| 44 | Claude Sonnet 4 | 5.42 | Claude Sonnet 4 anthropic-claude-sonnet-4 | Imported | 2026-05-06 |
| 44 | Llama 4 Maverick | 5.34 | Llama 4 Maverick meta-llama-4-maverick | Imported | 2026-05-06 |
| 45 | GPT-4.1 | 4.97 | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-06 |
| 45 | kimi-k2-instruct | 4.68 | MoonshotAI: Kimi K2 0711 moonshotai-kimi-k2 | Imported | 2026-05-06 |
| 47 | DeepSeek V3 (March 2025) | 4.55 | DeepSeek V3 deepseek-deepseek-chat | Imported | 2026-05-06 |
| 47 | Gemini-1.5-Pro-002 | 4.55 | — | Imported | 2026-05-06 |
| 47 | Nova Micro | 4.41 | Nova Micro 1.0 amazon-nova-micro-v1 | Imported | 2026-05-06 |
| 48 | Mistral Medium 3 | 4.36 | Mistral: Mistral Medium 3 mistralai-mistral-medium-3 | Imported | 2026-05-06 |
| 48 | Claude 3.5 Sonnet (October 2024) | 4.32 | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-06 |
| 48 | Nova Pro | 4.32 | Nova Pro 1.0 amazon-nova-pro-v1 | Imported | 2026-05-06 |
| 49 | Nova Lite | 3.76 | Nova Lite 1.0 amazon-nova-lite-v1 | Imported | 2026-05-06 |
| 59 | GPT-4o (November 2024) | 2.32 | GPT-4o openai-gpt-4o | Imported | 2026-05-06 |