BPL OCR Bench

A pairwise VLM-as-judge benchmark that ranks OCR models on British Library / BPL document images, using Bradley-Terry ELO scores with bootstrap confidence intervals.

Rows: 4
Primary metric: ELO
Sampled: 2026-05-06

Metadata

Metrics

ELO, Win%, Wins, Losses (lower is better), Ties, ELO Low, ELO High

Latest Results

Rows are parsed from the public Hugging Face dataset-server rows API. ELO rankings come from VLM-as-judge pairwise comparisons using Bradley-Terry MLE with bootstrap confidence intervals.
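As a rough illustration of how pairwise judgments can become ELO-style scores with intervals, here is a minimal sketch of a Bradley-Terry fit plus a percentile bootstrap. It is not the benchmark's actual code. The function names, the `(model_a, model_b, outcome)` match format, counting a tie as half a win for each side, the small floor on zero-win models, and the 1500-centred, 400-per-decade ELO scale are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def fit_bradley_terry(matches, iters=200, tol=1e-9):
    """Fit Bradley-Terry strengths with the classic MM update.

    matches: list of (model_a, model_b, outcome), outcome in {"a", "b", "tie"}.
    Returns ELO-like scores centred on 1500 (assumed scale).
    """
    wins = defaultdict(float)   # fractional wins per model
    games = defaultdict(float)  # number of comparisons per unordered pair
    models = set()
    for a, b, outcome in matches:
        models.update((a, b))
        games[frozenset((a, b))] += 1.0
        if outcome == "a":
            wins[a] += 1.0
        elif outcome == "b":
            wins[b] += 1.0
        else:  # assumption: a tie counts as half a win for each side
            wins[a] += 0.5
            wins[b] += 0.5

    models = sorted(models)
    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        new = {}
        for i in models:
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i and frozenset((i, j)) in games
            )
            # Assumption: floor zero-win models so the estimate stays finite.
            new[i] = max(wins[i], 1e-6) / denom if denom > 0 else strength[i]
        # Normalise so the geometric mean of the strengths is 1.
        log_mean = sum(math.log(v) for v in new.values()) / len(new)
        new = {m: v / math.exp(log_mean) for m, v in new.items()}
        delta = max(abs(new[m] - strength[m]) for m in models)
        strength = new
        if delta < tol:
            break
    return {m: 1500.0 + 400.0 * math.log10(s) for m, s in strength.items()}


def bootstrap_ci(matches, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap: resample matches with replacement and refit."""
    rng = random.Random(seed)
    samples = defaultdict(list)
    for _ in range(n_boot):
        resample = rng.choices(matches, k=len(matches))
        for model, elo in fit_bradley_terry(resample).items():
            samples[model].append(elo)
    intervals = {}
    for model, values in samples.items():
        values.sort()
        low = values[int((alpha / 2) * (len(values) - 1))]
        high = values[int((1 - alpha / 2) * (len(values) - 1))]
        intervals[model] = (low, high)
    return intervals
```

Under these assumptions, `fit_bradley_terry(matches)` would supply the point estimate for each model, and the `(low, high)` pair from `bootstrap_ci(matches)` would play the role of the ELO Low and ELO High columns.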

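| Rank | Subject | ELO | Model Match | Provenance | Sampled |
|------|---------|-----|-------------|------------|---------|
| 1 | lightonai/LightOnOCR-2-1B | 1559 | | Imported | 2026-05-06 |
| 2 | zai-org/GLM-OCR | 1535 | | Imported | 2026-05-06 |
| 3 | rednote-hilab/dots.ocr | 1453 | | Imported | 2026-05-06 |
| 4 | deepseek-ai/DeepSeek-OCR | 1452 | | Imported | 2026-05-06 |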
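
Since the results are described as parsed from the public Hugging Face dataset-server rows API, the following sketch shows one way such rows might be fetched and ranked. The `/rows` endpoint and its `dataset`, `config`, `split`, `offset`, and `length` parameters are part of that API. The dataset id, the `default` config, the `train` split, and the `elo` column name are placeholders assumed for illustration, not values taken from this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ROWS_ENDPOINT = "https://datasets-server.huggingface.co/rows"


def rows_url(dataset, config="default", split="train", offset=0, length=100):
    """Build a rows-API URL; config and split defaults are assumptions."""
    query = urlencode(
        {
            "dataset": dataset,
            "config": config,
            "split": split,
            "offset": offset,
            "length": length,
        }
    )
    return f"{ROWS_ENDPOINT}?{query}"


def extract_rows(payload):
    """Pull the plain row dicts out of a rows-API JSON response."""
    return [item["row"] for item in payload.get("rows", [])]


def fetch_rows(dataset, **kwargs):
    """Download one page of rows (performs a network request)."""
    with urlopen(rows_url(dataset, **kwargs), timeout=30) as response:
        return extract_rows(json.load(response))


def rank_by(rows, key="elo"):
    """Sort rows from highest to lowest score; 'elo' is a hypothetical column name."""
    return sorted(rows, key=lambda row: row[key], reverse=True)
```

A call such as `rank_by(fetch_rows("some-org/some-dataset"))` would then yield rows in leaderboard order, provided the placeholder id and column name are replaced with the real ones.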