DUE Benchmark

End-to-end document understanding benchmark suite with a public evaluator and task-specific document AI metrics.

Rows: 4
Primary metric: overall
Sampled: 2026-05-27

Metrics

Overall, QA, KIE, Table, DocVQA, Infographics, DeepForm, Papers with Code, Charity, WikiTableQuestions, TabFact

Each metric is reported together with a standard deviation column (for example, Overall Std). For the Std columns, lower is better.

Latest Results

Rows are parsed from the public DUE Benchmark submission API and ranked by the primary score, Overall.

Rank  Subject                    Overall  Model Match  Provenance  Sampled
1     T5 + 2D + Self-supervised  59.76    -            Imported    2026-05-27
2     T5 + Self-Supervised       56.51    -            Imported    2026-05-27
3     T5 Baseline                50.71    -            Imported    2026-05-27
4     T5 + 2D                    50.39    -            Imported    2026-05-27
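
The ranking in this table can be reproduced by sorting the rows on the primary score in descending order. The following is a minimal sketch, not the site's actual import code: the `Row` type, its field names, and the `rank_by_overall` helper are hypothetical, and the scores are copied by hand from the table rather than fetched from the submission API.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    """Hypothetical row shape; not the submission API's real schema."""
    subject: str
    overall: float


# Scores copied from the table above, deliberately listed out of order.
rows = [
    Row("T5 + 2D", 50.39),
    Row("T5 Baseline", 50.71),
    Row("T5 + Self-Supervised", 56.51),
    Row("T5 + 2D + Self-supervised", 59.76),
]


def rank_by_overall(rows: List[Row]) -> List[Tuple[int, str, float]]:
    """Sort by the primary score (higher is better) and attach 1-based ranks."""
    ordered = sorted(rows, key=lambda r: r.overall, reverse=True)
    return [(i, r.subject, r.overall) for i, r in enumerate(ordered, start=1)]


ranked = rank_by_overall(rows)
for rank, subject, overall in ranked:
    print(f"{rank}  {subject}  {overall:.2f}")
```

Ties are not handled specially here; Python's sort is stable, so rows with equal Overall scores would keep their input order.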