Visual-Language Understanding
Scale’s SEAL Leaderboard evaluates top models’ visual-language understanding, testing perception, logic, calculation, and common sense.
63 rows
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Confidence Interval Upper, Max Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Gemini 2.5 Pro Experimental (March 2025) | 54.65 | — | Imported | 2026-05-06 |
| 1 | gemini-2.5-pro-preview-06-05 | 54.63 | Gemini 2.5 Pro Preview 06-05 google-gemini-2.5-pro-preview | Imported | 2026-05-06 |
| 1 | gpt-5.4-pro-2026-03-05 | 53.89 | GPT-5.4 Pro openai-gpt-5.4-pro | Imported | 2026-05-06 |
| 2 | gpt-5-pro-2025-10-06 | 52.39 | GPT-5 Pro openai-gpt-5-pro | Imported | 2026-05-06 |
| 3 | o4-mini (high) (April 2025) | 51.79 | o4 Mini openai-o4-mini | Imported | 2026-05-06 |
| 3 | o4-mini (medium) (April 2025) | 51.66 | o4 Mini openai-o4-mini | Imported | 2026-05-06 |
| 3 | o3 Pro (high) (June 2025) | 51.63 | o3 Pro openai-o3-pro | Imported | 2026-05-06 |
| 3 | gemini-3-pro-preview | 51.49 | Gemini 3 google-gemini-3 | Imported | 2026-05-06 |
| 3 | gpt-5.4-2026-03-05 (xhigh thinking) | 50.89 | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-06 |
| 3 | Gemini 2.5 Pro Preview (May 06 2025) | 50.78 | Gemini 2.5 Pro Preview 06-05 google-gemini-2.5-pro-preview | Imported | 2026-05-06 |
| 3 | gpt-5-mini-2025-08-07 | 50.39 | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-06 |
| 6 | o3 (high) (April 2025) | 50.07 | o3 openai-o3 | Imported | 2026-05-06 |
| 7 | gpt-5-2025-08-07 | 49.69 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 9 | o3 (medium) (April 2025) | 49.59 | o3 openai-o3 | Imported | 2026-05-06 |
| 9 | claude-sonnet-4-5-20250929-thinking | 48.75 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 9 | gemini-3.1-flash-lite-preview | 46.93 | Gemini 3.1 Flash Lite Preview google-gemini-3.1-flash-lite-preview | Imported | 2026-05-06 |
| 10 | Gemini 2.5 Flash Preview (May 2025) | 49.15 | — | Imported | 2026-05-06 |
| 11 | o1 Pro (March 2025) | 47.32 | o1-pro openai-o1-pro | Imported | 2026-05-06 |
| 12 | Claude 3.7 Sonnet Thinking (Feb 2025) | 48.23 | Claude 3.7 Sonnet (thinking) anthropic-claude-3.7-sonnet-thinking | Imported | 2026-05-06 |
| 13 | claude-opus-4-1-20250805-thinking | 48.44 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 13 | gpt-5.2-2025-12-11 | 46.62 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-06 |
| 15 | Gemini 2.5 Flash (April 2025) | 46.97 | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-06 |
| 17 | Claude Opus 4 (Thinking) | 46.96 | Claude Opus 4 anthropic-claude-opus-4 | Imported | 2026-05-06 |
| 19 | claude-opus-4-5-20251101-thinking | 46.43 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 19 | Gemini 2.0 Flash Thinking Experimental | 45.50 | — | Imported | 2026-05-06 |
| 20 | claude-opus-4-6-thinking-max | 46.07 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 20 | GPT-4.1 | 45.34 | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-06 |
| 20 | kimi-k2.5 | 41.86 | MoonshotAI: Kimi K2.5 moonshotai-kimi-k2.5 | Imported | 2026-05-06 |
| 21 | claude-opus-4-6 (Non-Thinking) | 45.48 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 21 | claude-opus-4-1-20250805 | 45.25 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 22 | Claude Sonnet 4 (Thinking) | 45.49 | Claude Sonnet 4 anthropic-claude-sonnet-4 | Imported | 2026-05-06 |
| 23 | o1 (December 2024) | 45.25 | o1 openai-o1 | Imported | 2026-05-06 |
| 24 | claude-opus-4-5-20251101 | 45.32 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 26 | claude-sonnet-4-5-20250929 | 45 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 26 | gpt-5.1-thinking | 43.82 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-06 |
| 29 | Claude Opus 4 | 43.53 | Claude Opus 4 anthropic-claude-opus-4 | Imported | 2026-05-06 |
| 30 | Gemini 2.0 Pro Experimental (Feb 2025) | 43.25 | — | Imported | 2026-05-06 |
| 34 | Claude Sonnet 4 | 43.21 | Claude Sonnet 4 anthropic-claude-sonnet-4 | Imported | 2026-05-06 |
| 34 | Claude 3.7 Sonnet (February 2025) | 43.02 | Claude 3.7 Sonnet anthropic-claude-3.7-sonnet | Imported | 2026-05-06 |
| 34 | GPT-4.5 Preview (February 2025) | 42.11 | GPT-4.5 openai-gpt-4.5-preview | Imported | 2026-05-06 |
| 39 | GPT-4.1 mini | 41.14 | GPT-4.1 Mini openai-gpt-4.1-mini | Imported | 2026-05-06 |
| 39 | Gemini 2.0 Flash Experimental | 39.95 | — | Imported | 2026-05-06 |
| 40 | Gemini 2.0 Flash (February 2025) | 39.85 | Gemini 2.0 Flash google-gemini-2.0-flash | Imported | 2026-05-06 |
| 41 | Claude 3.5 Sonnet (October 2024) | 38.72 | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-06 |
| 43 | Claude 3.5 Sonnet (June 2024) | 38.37 | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-06 |
| 43 | Llama 4 Maverick | 38.33 | Llama 4 Maverick meta-llama-4-maverick | Imported | 2026-05-06 |
| 43 | ChatGPT-4o-latest (November 2024) | 37.99 | — | Imported | 2026-05-06 |
| 43 | Gemini 1.5 Pro | 37.07 | — | Imported | 2026-05-06 |
| 48 | gpt-5.1-instant | 34.87 | GPT-5.1 Chat openai-gpt-5.1-chat | Imported | 2026-05-06 |
| 49 | GPT-4o (August 2024) | 34.94 | GPT-4o openai-gpt-4o | Imported | 2026-05-06 |
| 49 | Mistral Medium 3 | 34.59 | Mistral: Mistral Medium 3 mistralai-mistral-medium-3 | Imported | 2026-05-06 |
| 49 | Gemini 1.5 Flash 002 | 34.03 | — | Imported | 2026-05-06 |
| 50 | Pixtral Large (November 2024) | 33.89 | — | Imported | 2026-05-06 |
| 50 | Gemini 2.0 Flash Lite Preview | 32.69 | — | Imported | 2026-05-06 |
| 55 | Qwen2-VL-72B-Instruct | 28.56 | — | Imported | 2026-05-06 |
| 55 | Claude 3 Opus | 27.82 | — | Imported | 2026-05-06 |
| 57 | GPT-4.1 nano | 26.55 | GPT-4.1 Nano openai-gpt-4.1-nano | Imported | 2026-05-06 |
| 57 | Nova Pro | 26.27 | Nova Pro 1.0 amazon-nova-pro-v1 | Imported | 2026-05-06 |
| 57 | Pixtral 12B (September 2024) | 25.97 | — | Imported | 2026-05-06 |
| 57 | Nova Lite | 25.50 | Nova Lite 1.0 amazon-nova-lite-v1 | Imported | 2026-05-06 |
| 59 | Llama 3.2 90B Vision Instruct | 24.61 | — | Imported | 2026-05-06 |
| 62 | Llama 3.2 11B Vision-Instruct | 20.47 | Llama 3.2 11B Vision Instruct meta-llama-llama-3.2-11b-vision-instruct | Imported | 2026-05-06 |
| 63 | Phi 3.5 Vision-Instruct | 15.18 | — | Imported | 2026-05-06 |