BEHAVIOR 2025 Challenge

Robotics challenge leaderboard for BEHAVIOR-1K household activity policies, ranked by Q-score and full-task success.

Rows: 18
Primary metric: heldout_q_score
Sampled: 2026-05-27

Metadata

Metrics

Held-Out Q-Score, Public Validation Q-Score, Held-Out Full Task Success Rate, Public Validation Full Task Success Rate

Latest Results

Rows parsed from the public BEHAVIOR 2025 challenge leaderboard. The primary score uses held-out Q-score when available, otherwise public-validation Q-score.
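The fallback rule above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's actual code: the function names `primary_score` and `rank_rows`, the row layout, and the entry `hypothetical-team` with its public-validation value are assumptions made for the example. The other three entries reuse held-out scores from the table in this section.

```python
from typing import List, Optional, Tuple

# (subject, held-out Q-score, public-validation Q-score); None means "not reported".
Row = Tuple[str, Optional[float], Optional[float]]


def primary_score(heldout_q: Optional[float], public_q: Optional[float]) -> Optional[float]:
    """Prefer the held-out Q-score; fall back to public validation if it is missing."""
    # Compare against None explicitly: a held-out score of 0 is a real result
    # and must not trigger the fallback (which `heldout_q or public_q` would do).
    if heldout_q is not None:
        return heldout_q
    return public_q


def rank_rows(rows: List[Row]) -> List[Tuple[int, str, float]]:
    """Sort subjects by primary score, highest first, and attach 1-based ranks."""
    scored = [(name, primary_score(h, p)) for name, h, p in rows]
    # Drop subjects that report neither score.
    scored = [(name, s) for name, s in scored if s is not None]
    # list.sort is stable, so tied scores keep their input order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(i + 1, name, s) for i, (name, s) in enumerate(scored)]


example_rows: List[Row] = [
    ("Cloud-Data", 0.0, None),
    ("hypothetical-team", None, 0.05),  # made-up entry to exercise the fallback
    ("Comet", 0.2514, None),
    ("Robot Learning Collective", 0.2599, None),
]

ranked = rank_rows(example_rows)
for rank, name, score in ranked:
    print(rank, name, score)
```

The explicit `is not None` check matters here because several subjects have a held-out Q-score of exactly 0, which a truthiness test would treat as missing.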

Rank  Subject                    Held-Out Q-Score  Model Match  Provenance  Sampled
1     Robot Learning Collective  0.2599                         Imported    2026-05-27
2     Comet                      0.2514                         Imported    2026-05-27
3     SimpleAI Robot             0.1591                         Imported    2026-05-27
4     The North Star             0.1204                         Imported    2026-05-27
5     Embodied Intelligence      0.0947                         Imported    2026-05-27
6     RAPPER                     0.075                          Imported    2026-05-27
7     tobi                       0.0717                         Imported    2026-05-27
8     MR                         0.0512                         Imported    2026-05-27
9     RACL                       0.014                          Imported    2026-05-27
10    Ahri+EFFL+MLV              0.01                           Imported    2026-05-27
11    Merlin Labs                0.009                          Imported    2026-05-27
12    LYQRobotics                0.008                          Imported    2026-05-27
13    ACT                        0.0037                         Imported    2026-05-27
14    StarVLA                    0.0019                         Imported    2026-05-27
15    Cloud-Data                 0                              Imported    2026-05-27
16    RobotSimArk                0                              Imported    2026-05-27
17    EntropyMaximum             0                              Imported    2026-05-27
18    Magikid                    0                              Imported    2026-05-27