ChartQA

ChartQA is a large-scale benchmark comprising 9.6K human-written questions and 23.1K questions generated from human-written chart summaries, designed to evaluate models' abilities in visual and logical reasoning over charts.

Rows: 24
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score
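
This page does not say how Score is computed, and the results below are self-reported. The original ChartQA paper evaluates with relaxed accuracy: a numeric answer counts as correct if it is within 5% of the gold value, and any other answer must match exactly. The sketch below illustrates that metric, assuming the reported scores follow it. The function names, the case-insensitive string comparison, and the number parsing rules are illustrative choices, not taken from this page or from any official evaluation script.

```python
def _to_float(text):
    """Parse a numeric answer, ignoring thousands separators and a trailing '%'.

    Returns None when the text is not a number. (Hypothetical helper.)
    """
    cleaned = text.strip().replace(",", "").rstrip("%")
    try:
        return float(cleaned)
    except ValueError:
        return None


def relaxed_match(prediction, target, tolerance=0.05):
    """Return True if prediction matches target under relaxed accuracy.

    Numeric answers may deviate from the target by up to `tolerance`
    (relative error); everything else is compared as a string.
    """
    pred_num = _to_float(prediction)
    target_num = _to_float(target)
    if pred_num is not None and target_num is not None:
        if target_num == 0.0:
            # Relative error is undefined for a zero target; require equality.
            return pred_num == 0.0
        return abs(pred_num - target_num) / abs(target_num) <= tolerance
    # Assumption: whitespace- and case-insensitive comparison for text answers.
    return prediction.strip().lower() == target.strip().lower()


def relaxed_accuracy(predictions, targets):
    """Fraction of predictions that match their targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(relaxed_match(p, t) for p, t in zip(predictions, targets))
    return hits / len(targets)
```

Under this sketch, a prediction of "104" against a gold answer of "100" counts as correct, while "106" does not. A score such as 0.91 would then mean that 91% of answers matched under these rules.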

Latest Results

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | Claude 3.5 Sonnet | 0.91 | Claude 3.5 Sonnet (anthropic-claude-3.5-sonnet) | Self-reported | 2026-05-06
2 | Llama 4 Maverick | 0.90 | Llama 4 Maverick (meta-llama-4-maverick) | Self-reported | 2026-05-06
3 | Qwen2.5 VL 72B Instruct | 0.90 | Qwen2.5 VL 72B Instruct (qwen-qwen2.5-vl-72b-instruct) | Self-reported | 2026-05-06
4 | Nova Pro | 0.89 | Nova Pro 1.0 (amazon-nova-pro-v1) | Self-reported | 2026-05-06
5 | Llama 4 Scout | 0.89 | Llama 4 Scout (meta-llama-llama-4-scout) | Self-reported | 2026-05-06
6 | Qwen2-VL-72B-Instruct | 0.88 | - | Self-reported | 2026-05-06
7 | Pixtral Large | 0.88 | - | Self-reported | 2026-05-06
8 | Mistral Small 3.2 24B Instruct | 0.87 | Mistral: Mistral Small 3.2 24B (mistralai-mistral-small-3.2-24b-instruct) | Self-reported | 2026-05-06
9 | Qwen2.5 VL 7B Instruct | 0.87 | - | Self-reported | 2026-05-06
10 | Nova Lite | 0.87 | Nova Lite 1.0 (amazon-nova-lite-v1) | Self-reported | 2026-05-06
11 | DeepSeek VL2 | 0.86 | - | Self-reported | 2026-05-06
12 | GPT-4o | 0.86 | GPT-4o (2024-08-06) (openai-gpt-4o-2024-08-06) | Self-reported | 2026-05-06
13 | Llama 3.2 90B Instruct | 0.85 | - | Self-reported | 2026-05-06
14 | Qwen2.5-Omni-7B | 0.85 | - | Self-reported | 2026-05-06
15 | DeepSeek VL2 Small | 0.84 | - | Self-reported | 2026-05-06
16 | Llama 3.2 11B Instruct | 0.83 | - | Self-reported | 2026-05-06
17 | Pixtral-12B | 0.82 | - | Self-reported | 2026-05-06
17 | Phi-3.5-vision-instruct | 0.82 | - | Self-reported | 2026-05-06
19 | Phi-4-multimodal-instruct | 0.81 | - | Self-reported | 2026-05-06
20 | DeepSeek VL2 Tiny | 0.81 | - | Self-reported | 2026-05-06
21 | Gemma 3 27B | 0.78 | Gemma 3 27B (google-gemma-3-27b-it) | Self-reported | 2026-05-06
22 | Grok-1.5V | 0.76 | - | Self-reported | 2026-05-06
23 | Gemma 3 12B | 0.76 | Gemma 3 12B (google-gemma-3-12b-it) | Self-reported | 2026-05-06
24 | Gemma 3 4B | 0.69 | Gemma 3 4B (google-gemma-3-4b-it) | Self-reported | 2026-05-06