WHOOPS Explanation of Violation

WHOOPS evaluates vision-language models on commonsense-defying images; this leaderboard snapshot tracks the explanation-of-violation human metric from the public WHOOPS full leaderboard.

Rows: 11
Primary metric: human_metric_explanation_of_violation
Sampled: 2026-05-06

Metadata

Metrics

Human Metric - Explanation of Violation, Auto Metric - Explanation of Violation, Identify - Explanation of Violation, Image Captioning, Visual Question Answering, Image-Text Matching

Latest Results

| Rank | Subject | Human Metric - Explanation of Violation | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Humans | 95 | | Imported | 2026-05-06 |
| 2 | Ground-truth Caption _ GPT3 (Oracle) | 68 | | Imported | 2026-05-06 |
| 3 | Predicted Caption _ GPT3 | 33 | | Imported | 2026-05-06 |
| 4 | BLIP2 FlanT5-XXL (Fine-tuned) | 27 | | Imported | 2026-05-06 |
| 5 | BLIP2 FlanT5-XL (Fine-tuned) | 15 | | Imported | 2026-05-06 |
| 6 | BLIP Large (Zero-shot) | 0 | | Imported | 2026-05-06 |
| 7 | BLIP2 FlanT5-XXL (Text only FT) | 0 | | Imported | 2026-05-06 |
| 8 | BLIP2 FlanT5-XXL (Zero-shot) | 0 | | Imported | 2026-05-06 |
| 9 | CLIP ViT-L/14 (Zero-shot) | 0 | | Imported | 2026-05-06 |
| 10 | CoCa ViT-L-14 MSCOCO (Zero-shot) | 0 | | Imported | 2026-05-06 |
| 11 | OFA Large (Zero-shot) | 0 | | Imported | 2026-05-06 |
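
The rank order in the table above can be reproduced from the scores alone. The following is a minimal Python sketch, not the leaderboard's actual code: the names `ROWS`, `rank_rows`, and `RANKED` are hypothetical, and the tie-break rule (subject name in ascending order) is an assumption inferred from how the six rows scoring 0 happen to be ordered.

```python
# Rows transcribed from the table above as (subject, score) pairs,
# deliberately listed out of order to show that sorting restores the ranks.
ROWS = [
    ("OFA Large (Zero-shot)", 0),
    ("Predicted Caption _ GPT3", 33),
    ("CoCa ViT-L-14 MSCOCO (Zero-shot)", 0),
    ("BLIP2 FlanT5-XL (Fine-tuned)", 15),
    ("CLIP ViT-L/14 (Zero-shot)", 0),
    ("Humans", 95),
    ("BLIP2 FlanT5-XXL (Zero-shot)", 0),
    ("Ground-truth Caption _ GPT3 (Oracle)", 68),
    ("BLIP2 FlanT5-XXL (Text only FT)", 0),
    ("BLIP2 FlanT5-XXL (Fine-tuned)", 27),
    ("BLIP Large (Zero-shot)", 0),
]


def rank_rows(rows):
    """Return (rank, subject, score) tuples.

    Sorts by score descending; ties are broken by subject name ascending,
    which is an assumed rule, not one documented by the leaderboard.
    """
    ordered = sorted(rows, key=lambda row: (-row[1], row[0]))
    return [
        (rank, subject, score)
        for rank, (subject, score) in enumerate(ordered, start=1)
    ]


RANKED = rank_rows(ROWS)

if __name__ == "__main__":
    for rank, subject, score in RANKED:
        print(f"{rank:>2}  {score:>3}  {subject}")
```

Sorting on the tuple `(-score, subject)` keeps the rule in one key function, so a different tie-break would only require changing that key.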