VPCT

Visual perception and comprehension tasks for evaluating multimodal model reasoning.

11 rows
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score (primary metric; higher scores rank first), Standard error (lower is better)

Latest Results

Rows parsed from the public leaderboard table.

| Rank | Subject | Score | Model Match | Provenance | Sampled |
|------|---------|-------|-------------|------------|---------|
| 1 | Gemini 3 Pro | 91 | Gemini 3 (google-gemini-3) | Imported | 2026-05-06 |
| 2 | GPT-5.2 | 84 | GPT-5.2 (openai-gpt-5.2) | Imported | 2026-05-06 |
| 3 | o4-mini-2025-04-16 medium | 57.50 | o4 Mini (openai-o4-mini) | Imported | 2026-05-06 |
| 4 | o3 | 52 | o3 (openai-o3) | Imported | 2026-05-06 |
| 5 | Gemini 2.5 Pro (Jun 2025) | 48 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-06 |
| 6 | GPT-4.1 | 45 | GPT-4.1 (openai-gpt-4.1) | Imported | 2026-05-06 |
| 7 | GPT-4o | 40 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06 |
| 8 | Claude Opus 4.5 | 40 | Claude Opus 4.5 (anthropic-claude-opus-4.5) | Imported | 2026-05-06 |
| 9 | Claude 3.7 Sonnet | 39 | Claude 3.7 Sonnet (anthropic-claude-3.7-sonnet) | Imported | 2026-05-06 |
| 10 | o1 | 37 | o1 (openai-o1) | Imported | 2026-05-06 |
| 11 | gpt-4o-mini-2024-07-18 | 34 | GPT-4o-mini (2024-07-18) (openai-gpt-4o-mini-2024-07-18) | Imported | 2026-05-06 |
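
The rank order of the rows above can be reproduced by sorting on the score in descending order. The following is a minimal sketch, not the leaderboard's actual import code: the `Row` type, the `ROWS` list, and the `rank_rows` helper are hypothetical names introduced for illustration, and the tie-breaking rule (equal scores keep their listed order, as with the two rows at 40) is an assumption.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    """One leaderboard row (hypothetical structure for illustration)."""
    subject: str
    score: float
    model_id: str


# Scores copied from the rows listed above.
ROWS = [
    Row("Gemini 3 Pro", 91, "google-gemini-3"),
    Row("GPT-5.2", 84, "openai-gpt-5.2"),
    Row("o4-mini-2025-04-16 medium", 57.50, "openai-o4-mini"),
    Row("o3", 52, "openai-o3"),
    Row("Gemini 2.5 Pro (Jun 2025)", 48, "google-gemini-2.5-pro"),
    Row("GPT-4.1", 45, "openai-gpt-4.1"),
    Row("GPT-4o", 40, "openai-gpt-4o"),
    Row("Claude Opus 4.5", 40, "anthropic-claude-opus-4.5"),
    Row("Claude 3.7 Sonnet", 39, "anthropic-claude-3.7-sonnet"),
    Row("o1", 37, "openai-o1"),
    Row("gpt-4o-mini-2024-07-18", 34, "openai-gpt-4o-mini-2024-07-18"),
]


def rank_rows(rows: List[Row]) -> List[Tuple[int, Row]]:
    """Return (rank, row) pairs sorted by descending score.

    Python's sort is stable, so rows with equal scores keep their
    input order (an assumed tie-breaking rule).
    """
    ordered = sorted(rows, key=lambda r: r.score, reverse=True)
    return list(enumerate(ordered, start=1))


if __name__ == "__main__":
    for rank, row in rank_rows(ROWS):
        print(f"{rank:>2}  {row.subject:<28} {row.score:g}")
```

Ranks here are assigned sequentially, matching the listing, where the two rows scoring 40 occupy ranks 7 and 8 rather than sharing a rank.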