GeneBench
GeneBench is an evaluation focused on multi-stage scientific data analysis in genetics and quantitative biology. Tasks require reasoning about ambiguous or noisy data with minimal supervisory guidance, addressing realistic obstacles such as hidden confounders or QC failures, and correctly implementing and interpreting modern statistical methods.
6 rows
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Normalized Score
Showing 2 latest source slices.
Sampled 2026-05-06 (self-reported):

| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-5.5 Pro | 0.33 | GPT-5.5 Pro (`openai-gpt-5.5-pro`) | Self-reported | 2026-05-06 |
| 2 | GPT-5.5 | 0.25 | GPT-5.5 (`openai-gpt-5.5`) | Self-reported | 2026-05-06 |

Sampled 2026-04-23 (launch post):

| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-5.5 Pro | 33.2% | GPT-5.5 Pro (`openai-gpt-5.5-pro`) | Launch post | 2026-04-23 |
| 2 | GPT-5.4 Pro | 25.6% | GPT-5.4 Pro (`openai-gpt-5.4-pro`) | Launch post | 2026-04-23 |
| 3 | GPT-5.5 | 25% | GPT-5.5 (`openai-gpt-5.5`) | Launch post | 2026-04-23 |
| 4 | GPT-5.4 | 19% | GPT-5.4 (`openai-gpt-5.4`) | Launch post | 2026-04-23 |
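
The rows above report scores in two notations: bare decimals (0.33) in the 2026-05-06 slice and percentages (33.2%) in the 2026-04-23 slice. A minimal Python sketch for putting them on one scale before comparing slices is shown below. It assumes that a trailing `%` marks a percentage and that a bare value is already a fraction of 1; the page does not state this, and the helper name `parse_score` is hypothetical.

```python
def parse_score(raw: str) -> float:
    """Return a displayed score as a fraction of 1.

    Assumption (not confirmed by the page): values ending in '%' are
    percentages, and bare values are already fractions.
    """
    text = raw.strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    return float(text)


# Rows copied from the table: (subject, score as displayed, sampled date).
rows = [
    ("GPT-5.5 Pro", "0.33", "2026-05-06"),
    ("GPT-5.5", "0.25", "2026-05-06"),
    ("GPT-5.5 Pro", "33.2%", "2026-04-23"),
    ("GPT-5.4 Pro", "25.6%", "2026-04-23"),
    ("GPT-5.5", "25%", "2026-04-23"),
    ("GPT-5.4", "19%", "2026-04-23"),
]

# Convert every displayed score to the same fractional scale.
normalized = [(subject, parse_score(score), date) for subject, score, date in rows]

for subject, score, date in normalized:
    print(f"{date}  {subject:<12} {score:.3f}")
```

The decimal values appear to be rounded to two places, so small differences between slices for the same model may reflect rounding rather than a change in the underlying result.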