ChartBench
ChartBench evaluates complex visual reasoning over charts across chart types and task combinations.
21 rows
Primary metric: all_acc_plus
Sampled: 2026-05-06
Metadata
Metrics
Line Acc+, Line Acc, Line COR, Bar Acc+, Bar Acc, Bar COR, Pie Acc+, Pie Acc, Pie COR, Regular Acc+, Regular Acc, Regular COR, Area Acc+, Area Acc, Area COR, Box Acc+, Box Acc, Box COR, Radar Acc+, Radar Acc, Radar COR, Scatter Acc+, Scatter Acc, Scatter COR, Node Acc+, Node Acc, Node COR, Combination Acc+, Combination Acc, Combination COR, Extra Acc+, Extra Acc, Extra COR, All Acc+, All Acc, All COR
| Rank | Subject | All Acc+ | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4O | 64.27 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06 |
| 2 | GPT-4V | 54.39 | GPT-4 (openai-gpt-4) | Imported | 2026-05-06 |
| 3 | Internlm-XComposer-v2 | 51.34 | — | Imported | 2026-05-06 |
| 4 | ERNIE | 46.95 | — | Imported | 2026-05-06 |
| 5 | Mini-Gemini | 36.54 | — | Imported | 2026-05-06 |
| 6 | DocOwl-v1.5 | 31.62 | — | Imported | 2026-05-06 |
| 7 | Qwen-VL-Chat | 28.18 | — | Imported | 2026-05-06 |
| 8 | mPLUG-Owl-bloomz | 26.78 | — | Imported | 2026-05-06 |
| 9 | LLaVA-v1.5 | 26.39 | — | Imported | 2026-05-06 |
| 10 | MiniGPT-v2 | 23.55 | — | Imported | 2026-05-06 |
| 11 | ChartLlama | 22.26 | — | Imported | 2026-05-06 |
| 12 | BLIP2 | 20.24 | — | Imported | 2026-05-06 |
| 13 | CogAgent | 18.07 | — | Imported | 2026-05-06 |
| 14 | SPHINX | 17.89 | — | Imported | 2026-05-06 |
| 15 | Internlm-XComposer | 15.49 | — | Imported | 2026-05-06 |
| 16 | CogVLM | 13.30 | — | Imported | 2026-05-06 |
| 17 | InstructBLIP | 12.49 | — | Imported | 2026-05-06 |
| 18 | OneChart | 12.04 | — | Imported | 2026-05-06 |
| 19 | Shikra | 8.11 | — | Imported | 2026-05-06 |
| 20 | ChartVLM | 6.90 | — | Imported | 2026-05-06 |
| 21 | VisualGLM | 3.79 | — | Imported | 2026-05-06 |
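
The Rank column follows the All Acc+ score in descending order. As a minimal sketch, assuming a few rows are copied by hand into a Python list, that ordering can be reproduced as shown here. The `rows` list and the `rank_by_primary` helper are illustrative names, not part of ChartBench or any published tooling.

```python
# Top five rows copied from the leaderboard table: (subject, All Acc+).
# The list and helper names are hypothetical, chosen for this illustration.
rows = [
    ("Mini-Gemini", 36.54),
    ("GPT-4V", 54.39),
    ("ERNIE", 46.95),
    ("GPT-4O", 64.27),
    ("Internlm-XComposer-v2", 51.34),
]


def rank_by_primary(rows):
    """Sort by the primary metric (descending) and attach 1-based ranks."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    return [(i, name, score) for i, (name, score) in enumerate(ordered, start=1)]


ranked = rank_by_primary(rows)
for rank, name, score in ranked:
    print(f"{rank:>2}  {name:<24} {score:6.2f}")
```

The input is deliberately shuffled so that the sort does the work. Ties are not handled specially here because none occur in the table.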