WebMainBench
Human-annotated web main-content extraction benchmark evaluating extractors and model-backed pipelines on full-page ROUGE-N F1 plus fine-grained text, code, formula, and table metrics.
14 rows
Primary metric: rouge_n_f1_all
Sampled: 2026-05-06
Metadata
Metrics
ROUGE-N F1 All, ROUGE-N F1 Simple, ROUGE-N F1 Mid, ROUGE-N F1 Hard, Overall, Text Edit, Code Edit, Formula Edit, Table Edit, Table TEDS
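For readers unfamiliar with the primary metric, the following is a minimal sketch of how a ROUGE-N F1 score can be computed between an extracted page and its human-annotated reference. It assumes whitespace tokenization and n = 2; the benchmark's actual tokenizer, n-gram order, and any normalization are not stated on this page, and the function names `ngrams` and `rouge_n_f1` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=2):
    """ROUGE-N F1 between two strings.

    Assumption: whitespace tokenization and n=2 are illustrative defaults,
    not the benchmark's confirmed configuration.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the extracted text reproduces every reference n-gram with nothing extra; boilerplate left in the output lowers precision, and dropped main content lowers recall.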
| Rank | Subject | ROUGE-N F1 All | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | DeepSeek-V3.2 | 0.91 | DeepSeek V3.2 deepseek-deepseek-v3.2 | Imported | 2026-05-06 |
| 2 | GPT-5 | 0.90 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 3 | Gemini-2.5-Pro | 0.90 | Gemini 2.5 Pro google-gemini-2.5-pro | Imported | 2026-05-06 |
| 4 | Dripper_fallback | 0.89 | — | Imported | 2026-05-06 |
| 5 | Dripper (0.6B) | 0.88 | — | Imported | 2026-05-06 |
| 6 | mineru-html | 0.83 | — | Imported | 2026-05-06 |
| 7 | magic-html | 0.71 | — | Imported | 2026-05-06 |
| 8 | Readability | 0.65 | — | Imported | 2026-05-06 |
| 9 | Trafilatura | 0.64 | — | Imported | 2026-05-06 |
| 10 | Resiliparse | 0.63 | — | Imported | 2026-05-06 |
| 11 | magic-html | 0.51 | — | Imported | 2026-05-06 |
| 12 | trafilatura (md) | 0.39 | — | Imported | 2026-05-06 |
| 13 | resiliparse | 0.30 | — | Imported | 2026-05-06 |
| 14 | trafilatura (txt) | 0.27 | — | Imported | 2026-05-06 |