ParseBench

Document parsing benchmark for AI agents over enterprise documents, evaluating tables, charts, content faithfulness, semantic formatting, and visual grounding.

19 rows
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

ParseBench score

Latest Results

Rows are ordered by the rank returned by the Hugging Face leaderboard API. Model display names are preserved from the source modelId values.

| Rank | Subject | ParseBench score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | datalab-to/chandra-ocr-2 | 70.10 | | Imported | 2026-05-06 |
| 2 | google/gemma-4-31B-it | 62.40 | Gemma 4 31B (google-gemma-4-31b-it) | Imported | 2026-05-06 |
| 3 | google/gemma-4-26B-A4B-it | 58.50 | Gemma 4 26B A4B (google-gemma-4-26b-a4b-it) | Imported | 2026-05-06 |
| 4 | rednote-hilab/dots.mocr | 55.80 | | Imported | 2026-05-06 |
| 5 | docling-project/docling-models | 50.60 | | Imported | 2026-05-06 |
| 6 | lightonai/LightOnOCR-2-1B | 48.00 | | Imported | 2026-05-06 |
| 7 | Qwen/Qwen3-VL-8B-Instruct | 46.80 | Qwen3 VL 8B Instruct (qwen-qwen3-vl-8b-instruct) | Imported | 2026-05-06 |
| 8 | baidu/Qianfan-OCR | 46.20 | | Imported | 2026-05-06 |
| 9 | opendatalab/MinerU2.5-2509-1.2B | 45.90 | | Imported | 2026-05-06 |
| 10 | Qwen/Qwen3.6-35B-A3B | 44.10 | Qwen3.6 35B A3B (qwen-qwen3.6-35b-a3b) | Imported | 2026-05-06 |
| 11 | deepseek-ai/DeepSeek-OCR-2 | 41.20 | | Imported | 2026-05-06 |
| 12 | PaddlePaddle/PaddleOCR-VL | 40.90 | | Imported | 2026-05-06 |
| 13 | google/gemma-4-E4B-it | 40.50 | | Imported | 2026-05-06 |
| 14 | ibm-granite/granite-vision-4.1-4b | 39.45 | | Imported | 2026-05-06 |
| 15 | Qwen/Qwen3.5-4B | 35.40 | | Imported | 2026-05-06 |
| 16 | Qwen/Qwen3.5-9B | 31.90 | Qwen3.5-9B (qwen-qwen3.5-9b) | Imported | 2026-05-06 |
| 17 | zai-org/GLM-OCR | 29.60 | | Imported | 2026-05-06 |
| 18 | Qwen/Qwen3.5-0.8B | 28.40 | | Imported | 2026-05-06 |
| 19 | Qwen/Qwen3.5-2B | 27.30 | | Imported | 2026-05-06 |
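
For readers who want to sanity-check the listing, here is a minimal Python sketch using a handful of rows copied from the results above. It makes two assumptions that the page itself does not state. First, the API rank is expected to agree with descending ParseBench score. Second, the matched-model identifiers look like the modelId lowercased with "/" replaced by "-". The names `Row`, `slugify`, and `rank_matches_score` are hypothetical and are not part of any Hugging Face API.

```python
from typing import NamedTuple


class Row(NamedTuple):
    rank: int
    model_id: str
    score: float


# A few rows copied from the results listing (not the full leaderboard).
ROWS = [
    Row(1, "datalab-to/chandra-ocr-2", 70.10),
    Row(2, "google/gemma-4-31B-it", 62.40),
    Row(7, "Qwen/Qwen3-VL-8B-Instruct", 46.80),
    Row(10, "Qwen/Qwen3.6-35B-A3B", 44.10),
    Row(19, "Qwen/Qwen3.5-2B", 27.30),
]


def slugify(model_id: str) -> str:
    """Hypothetical helper: lowercase the modelId and replace '/' with '-'.

    This only mirrors the pattern visible in the Model Match column;
    the real pipeline may derive its identifiers differently.
    """
    return model_id.lower().replace("/", "-")


def rank_matches_score(rows: list[Row]) -> bool:
    """Return True if ordering rows by rank yields non-increasing scores."""
    ordered = sorted(rows, key=lambda r: r.rank)
    return all(a.score >= b.score for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    print("rank consistent with score:", rank_matches_score(ROWS))
    for row in ROWS:
        print(row.rank, row.model_id, slugify(row.model_id))
```

If `rank_matches_score` returns False for a fresh import, either the API is ranking on something other than the primary metric or a row was parsed incorrectly. The check does not say which.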