OmniDocBench 1.5

OmniDocBench 1.5 is a comprehensive benchmark for evaluating multimodal large language models on document understanding tasks, including OCR, document parsing, information extraction, and visual question answering across diverse document types. For the Overall Edit Distance metric, lower values are better; the results table on this page, however, is ranked with the highest score first.
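As a rough illustration of what an edit-distance metric measures, the sketch below computes a length-normalized Levenshtein distance between a predicted string and a reference string. This is an assumption made for illustration only: the page does not describe how OmniDocBench computes its Overall Edit Distance, and the function names `levenshtein` and `normalized_edit_distance` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Scale the distance into [0, 1] by the longer string's length.

    Assumed normalization for illustration; 0.0 means identical
    strings, so lower is better.
    """
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(prediction, reference) / longest
```

Under this normalization, a perfect transcription scores 0.0 and a completely wrong one approaches 1.0, which is why lower values are better for an edit-distance metric.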

Rows: 11
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score

Latest Results

Rank  Subject            Score  Model Match                                             Provenance     Sampled
1     Qwen3.6 Plus       0.91   Qwen3.6 Plus (qwen-qwen3.6-plus)                        Self-reported  2026-05-06
2     Qwen3.6-35B-A3B    0.90   Qwen3.6 35B A3B (qwen-qwen3.6-35b-a3b)                  Self-reported  2026-05-06
3     Qwen3.5-122B-A10B  0.90   Qwen3.5-122B-A10B (qwen-qwen3.5-122b-a10b)              Self-reported  2026-05-06
4     Qwen3.5-35B-A3B    0.89   Qwen3.5-35B-A3B (qwen-qwen3.5-35b-a3b)                  Self-reported  2026-05-06
5     GPT-5.4            0.89   GPT-5.4 (openai-gpt-5.4)                                Self-reported  2026-05-06
6     Qwen3.5-27B        0.89   Qwen3.5-27B (qwen-qwen3.5-27b)                          Self-reported  2026-05-06
7     Kimi K2.5          0.89   MoonshotAI: Kimi K2.5 (moonshotai-kimi-k2.5)            Self-reported  2026-05-06
8     GPT-5.4 mini       0.87   GPT-5.4 Mini (openai-gpt-5.4-mini)                      Self-reported  2026-05-06
9     GPT-5.4 nano       0.76   GPT-5.4 Nano (openai-gpt-5.4-nano)                      Self-reported  2026-05-06
10    Gemini 3 Flash     0.12   Gemini 3 Flash Preview (google-gemini-3-flash-preview)  Self-reported  2026-05-06
11    Gemini 3 Pro       0.12   Gemini 3 (google-gemini-3)                              Self-reported  2026-05-06