Leaderboards
Curated benchmark baskets for specific model-comparison views.
2leaderboards
| Name | Benchmarks | Models | Top models | Description |
|---|---|---|---|---|
| BenchmarkList General benchmarklist_general | 35 | 123 | Claude Mythos Preview, Claude Opus 4.8, Qwen3.7 Max | Default general-purpose frontier-model leaderboard basket across reasoning, coding, writing, agentic, multimodal, and structured-output benchmarks. |
| Legal AI legal | 9 | 60 | Claude Opus 4.7, GPT-5.5, GPT-5 | Legal reasoning, contract, legal-agent, and jurisdiction-specific benchmark basket. |
No matching rows.