Physical AI Bench Conditional Generation

Leaderboard for the conditional generation track of Physical AI Bench, covering controlled world-model generation under blur, edge, depth, segmentation, and combined conditioning.

Rows: 11
Primary metric: quality_score
Sampled: 2026-05-06

Metadata

Metrics

Blur SSIM, Edge F1, Depth si-RMSE (lower is better), Mask mIoU, Quality Score, Diversity
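
The page does not say how these metrics are computed. As a rough illustration only, the sketch below shows one common formulation of two of them: scale-invariant RMSE taken in log-depth space, and mean intersection-over-union over class labels. The function names `si_rmse` and `mean_iou`, the flat-list inputs, and the choice to skip classes absent from both inputs are assumptions made for this example, not the benchmark's implementation.

```python
import math

def si_rmse(pred, target):
    """Scale-invariant RMSE over log depths (lower is better).

    Assumed formulation: the standard deviation of the per-pixel
    log-depth differences, so a global scale factor cancels out.
    """
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(d)
    mean_d = sum(d) / n
    var = sum(x * x for x in d) / n - mean_d ** 2
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(var, 0.0))

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both inputs
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# A prediction that is exactly twice the target depth has (near) zero si-RMSE.
scaled_error = si_rmse([2.0, 4.0, 8.0], [1.0, 2.0, 4.0])
# Toy label maps flattened to lists of class ids.
toy_miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

The first call shows why si-RMSE suits depth conditioning: a generated video whose depth is correct up to a constant scale is not penalized.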

Latest Results

Rank | Subject                          | Quality Score | Model Match | Provenance | Sampled
1    | Wan2.2-Fun-5B-Control (Depth)    | 9.32          |             | Imported   | 2026-05-06
2    | Cosmos-Transfer2.5-2B (All)      | 9.24          |             | Imported   | 2026-05-06
3    | Wan2.2-Fun-A14B-Control (Depth)  | 9.22          |             | Imported   | 2026-05-06
4    | Wan2.2-Fun-A14B-Control (Edge)   | 9.00          |             | Imported   | 2026-05-06
5    | Wan2.2-Fun-A14B-Control (Blur)   | 8.81          |             | Imported   | 2026-05-06
6    | Wan2.2-Fun-5B-Control (Edge)     | 8.79          |             | Imported   | 2026-05-06
7    | Cosmos-Transfer2.5-2B (Blur)     | 8.77          |             | Imported   | 2026-05-06
8    | Cosmos-Transfer2.5-2B (Edge)     | 8.04          |             | Imported   | 2026-05-06
9    | Cosmos-Transfer2.5-2B (Seg)      | 7.87          |             | Imported   | 2026-05-06
10   | Wan2.2-Fun-A14B-Control (Seg)    | 7.79          |             | Imported   | 2026-05-06
11   | Cosmos-Transfer2.5-2B (Depth)    | 7.30          |             | Imported   | 2026-05-06
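
Because each row pairs a model with one conditioning setting, the ranking can be easier to read when regrouped by model. The sketch below is one possible way to do that. It hard-codes the eleven rows listed above, and the variable names `rows`, `best`, and `spread` are illustrative, not part of the leaderboard.

```python
# (model, conditioning, quality_score) copied from the results listed above.
rows = [
    ("Wan2.2-Fun-5B-Control", "Depth", 9.32),
    ("Cosmos-Transfer2.5-2B", "All", 9.24),
    ("Wan2.2-Fun-A14B-Control", "Depth", 9.22),
    ("Wan2.2-Fun-A14B-Control", "Edge", 9.00),
    ("Wan2.2-Fun-A14B-Control", "Blur", 8.81),
    ("Wan2.2-Fun-5B-Control", "Edge", 8.79),
    ("Cosmos-Transfer2.5-2B", "Blur", 8.77),
    ("Cosmos-Transfer2.5-2B", "Edge", 8.04),
    ("Cosmos-Transfer2.5-2B", "Seg", 7.87),
    ("Wan2.2-Fun-A14B-Control", "Seg", 7.79),
    ("Cosmos-Transfer2.5-2B", "Depth", 7.30),
]

# Best-scoring conditioning setting per model.
best = {}
for model, cond, score in rows:
    if model not in best or score > best[model][1]:
        best[model] = (cond, score)

# Gap between each model's best and worst setting, rounded for display.
spread = {}
for model in best:
    scores = [s for m, _, s in rows if m == model]
    spread[model] = round(max(scores) - min(scores), 2)

for model, (cond, score) in sorted(best.items(), key=lambda kv: -kv[1][1]):
    print(f"{model}: best={cond} ({score:.2f}), spread={spread[model]:.2f}")
```

Grouped this way, the rows show that depth conditioning is the strongest setting for both Wan2.2-Fun models but the weakest for Cosmos-Transfer2.5-2B, whose best score comes from the combined (All) setting.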