SLVMEval
Synthetic meta-evaluation benchmark for text-to-long-video evaluation systems, covering long videos and ten quality/consistency aspects.
Rows: 9
Primary metric: macro_accuracy
Sampled: 2026-05-28
Metadata
Metrics
Macro Accuracy, Aesthetics Accuracy, Technical Quality Accuracy, Appearance/Style Accuracy, Background Consistency Accuracy, Object Integrity Accuracy, Color Accuracy, Dynamics Degree Accuracy, Comprehensiveness Accuracy, Spatial Relationship Accuracy, Temporal Flow Accuracy
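The page lists Macro Accuracy alongside ten per-aspect accuracies but does not define how they relate. The sketch below assumes the usual convention, where macro accuracy is the unweighted mean of the per-aspect accuracies. That definition, the `macro_accuracy` helper, and the example values are all illustrative assumptions, not taken from the benchmark.

```python
from statistics import mean

# Aspect names copied from the Metrics list above.
ASPECTS = [
    "Aesthetics",
    "Technical Quality",
    "Appearance/Style",
    "Background Consistency",
    "Object Integrity",
    "Color",
    "Dynamics Degree",
    "Comprehensiveness",
    "Spatial Relationship",
    "Temporal Flow",
]


def macro_accuracy(per_aspect: dict[str, float]) -> float:
    """Unweighted mean of per-aspect accuracies (assumed definition)."""
    missing = [a for a in ASPECTS if a not in per_aspect]
    if missing:
        raise ValueError(f"missing aspects: {missing}")
    return mean(per_aspect[a] for a in ASPECTS)


# Hypothetical placeholder values, NOT results from the leaderboard.
example = {a: 0.5 for a in ASPECTS}
example["Color"] = 1.0

print(f"{macro_accuracy(example):.2%}")
```

Under this assumption every aspect counts equally, so an aspect with few test items moves the headline number as much as one with many.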
| Rank | Subject | Macro Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Human | 91.73% | — | Imported | 2026-05-28 |
| 2 | Video-based GPT-5 | 71.66% | — | Imported | 2026-05-28 |
| 3 | Text-based GPT-5-mini | 61.35% | — | Imported | 2026-05-28 |
| 4 | Video-based GPT-5-mini | 61.32% | — | Imported | 2026-05-28 |
| 5 | CLIPScore | 60.74% | — | Imported | 2026-05-28 |
| 6 | Text-based GPT-5 | 60.70% | — | Imported | 2026-05-28 |
| 7 | Text-based Qwen3 | 56.92% | — | Imported | 2026-05-28 |
| 8 | Video-based Qwen3 | 50.44% | — | Imported | 2026-05-28 |
| 9 | VideoScore | 50.20% | — | Imported | 2026-05-28 |