Benchmark List: General
Default general-purpose frontier-model leaderboard basket across reasoning, coding, writing, agentic, multimodal, and structured-output benchmarks.
123 models
35 benchmarks
Benchmark categories: General, Coding, Agents & Tools, Data, Multimodal, Other.

| # | Model (family) | Avg Rank (coverage) | HLE | GD | BenchLM | MMLU-Redux | LiveBench | MMLU-ProX | ABL | Multi-IF | CWV | BZ | WritingBench | AHV | EQ-Bench | ALE-Bench | SBP | LiveCodeBench | BH | NL2Repo | GLR | ARC-AGI-2 | PinchBench | BFC | MCPMark | AutoBench | SO | EG | OSWorld-Verified | LW | TA | AutomationBench | LiveSQLBench | VideoMMMU | MMMU-Pro | CC-OCR | MATH-500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Mythos Preview Claude | 1.0 5/35 rows | #1 | #1 | #1 | — | — | — | — | — | — | — | — | — | — | — | #1 | — | — | — | — | — | — | — | — | — | — | — | #1 | — | — | — | — | — | — | — | — |
| 2 | Claude Opus 4.8 Claude | 1.4 5/35 rows | #1 | #3 | — | — | — | — | — | — | — | — | — | — | — | — | #1 | — | — | — | — | — | — | — | — | — | — | — | #1 | — | — | #1 | — | — | — | — | — |
| 3 | Qwen3.7 Max Qwen | 1.4 8/35 rows | #1 | #1 | — | #3 | — | #1 | — | — | — | — | — | — | — | — | #1 | #2 | — | #2 | — | — | — | — | #1 | — | — | — | — | — | — | — | — | — | — | — | — |
| 4 | GPT-5.5 GPT | 2.1 13/35 rows | #2 | #2 | #3 | — | #1 | — | — | — | — | — | — | — | — | #1 | #2 | — | — | — | #1 | #1 | — | — | — | — | #7 | — | #1 | — | — | #2 | #3 | — | #1 | — | — |
| 5 | Claude Opus 4.7 Claude | 2.5 10/35 rows | #2 | #2 | #5 | — | — | — | — | — | — | — | — | — | — | — | #1 | — | — | — | #2 | #4 | — | — | — | #1 | #4 | — | #2 | — | — | #3 | — | — | — | — | — |
| 6 | Claude Opus 4.6 Claude | 4.3 19/35 rows | #2 | #2 | #10 | #2 | — | #2 | #1 | — | — | — | — | — | — | — | #5 | #4 | — | #1 | #6 | #14 | #1 | — | #4 | #2 | #12 | #1 | #4 | #4 | — | — | #2 | — | — | — | — |
| 7 | Qwen3 235B A22B Thinking 2507 Qwen | 4.4 7/35 rows | — | — | — | #5 | — | #5 | — | #1 | #5 | — | #1 | #3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #10 | — | — | — | — | — | — |
| 8 | GPT-5.4 GPT | 5.1 15/35 rows | #4 | #5 | #8 | — | #2 | — | #23 | — | — | — | — | — | — | #3 | #3 | — | — | — | #8 | #5 | #3 | — | — | #6 | #1 | — | #3 | — | — | — | #8 | — | #2 | — | — |
| 9 | Gemini 3.1 Pro Preview Gemini | 5.7 17/35 rows | #1 | #1 | #2 | — | #3 | — | #4 | — | — | — | — | — | — | #19 | #4 | — | — | — | #24 | #3 | #18 | — | — | #3 | #2 | #4 | #4 | — | — | #4 | #1 | — | #3 | — | — |
| 10 | MoonshotAI: Kimi K2.6 Kimi | 6.3 14/35 rows | #4 | #3 | #12 | #1 | #23 | #6 | — | — | — | — | — | — | — | #22 | #2 | #3 | — | #3 | #5 | — | — | — | #5 | — | — | — | #5 | — | — | — | #5 | — | — | — | — |
| 11 | Qwen3 VL 235B A22B Thinking Qwen | 6.4 7/35 rows | — | — | — | #6 | — | #7 | — | #3 | #6 | — | #3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #15 | — | #5 | — |
| 12 | DeepSeek V4 Pro DeepSeek | 7.4 13/35 rows | #3 | #5 | #9 | #4 | #13 | #4 | — | — | — | — | — | — | — | #26 | #3 | #1 | — | #5 | #15 | — | — | — | #3 | — | #13 | — | — | — | — | — | — | — | — | — | — |
| 13 | Qwen3 Next 80B A3B Thinking Qwen | 9.7 6/35 rows | — | — | — | #15 | — | #10 | — | #5 | — | — | #9 | #10 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #9 | — | — | — | — | — | — |
| 14 | GLM 5.1 GLM | 9.7 15/35 rows | #5 | #6 | #14 | #6 | #29 | #5 | — | — | — | — | — | — | — | #35 | #4 | — | — | #1 | #11 | — | #29 | — | #2 | #5 | #3 | — | — | — | — | — | #6 | — | — | — | — |
| 15 | GPT-5.3-Codex Codex | 10.5 9/35 rows | #6 | #6 | #11 | — | #18 | — | #14 | — | — | — | — | — | — | #2 | — | — | — | — | #22 | — | — | — | — | — | — | — | #7 | — | — | — | #9 | — | — | — | — |
| 16 | GPT-5.2 GPT | 12.3 14/35 rows | #13 | #13 | #19 | — | #11 | — | #3 | — | — | — | — | — | — | #10 | — | — | — | — | #23 | #28 | — | #16 | #1 | — | — | #6 | — | #9 | — | — | — | #4 | #7 | — | — |
| 17 | GPT-5.4 Pro GPT | 13.1 5/35 rows | #1 | #1 | #4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #2 | #67 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 18 | Qwen3.6 Plus Qwen | 13.1 17/35 rows | #6 | #4 | #29 | #2 | #28 | #1 | — | — | — | — | — | — | — | #52 | #6 | #5 | — | #3 | #17 | — | #61 | — | #6 | #8 | — | — | #8 | — | — | — | — | #8 | — | #1 | — |
| 19 | Qwen3 VL 30B A3B Thinking Qwen | 13.4 8/35 rows | — | — | — | #20 | — | #15 | — | #12 | #11 | — | #7 | #13 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #17 | — | #13 | — |
| 20 | Qwen3 VL 8B Thinking Qwen | 14.7 8/35 rows | — | — | — | #25 | — | #19 | — | #9 | #12 | — | #5 | #14 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #19 | — | #16 | — |
| 21 | Gemini 3 Gemini | 15.2 13/35 rows | #9 | #11 | #18 | — | — | — | #27 | — | — | — | — | — | — | #17 | — | — | — | — | #14 | #35 | #56 | #3 | #2 | — | — | #9 | — | #2 | — | — | — | #1 | — | — | — |
| 22 | Claude Opus 4.5 Claude | 15.4 10/35 rows | #29 | #39 | #23 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #7 | — | #16 | #1 | #7 | — | — | #3 | — | #6 | — | — | — | — | #18 | — | — |
| 23 | Claude Sonnet 4.6 Claude | 15.6 10/35 rows | #24 | #29 | #15 | — | — | — | #20 | — | — | — | — | — | — | — | — | — | — | — | #9 | #23 | #13 | — | — | #4 | #11 | #2 | — | — | — | — | — | — | — | — | — |
| 24 | GPT-5.2-Codex Codex | 16.2 6/35 rows | #18 | #15 | #24 | — | #15 | — | — | — | — | — | — | — | — | #9 | — | — | — | — | #16 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 25 | Gemini 3 Flash Preview Gemini | 16.6 13/35 rows | #15 | #16 | #40 | — | #19 | — | #6 | — | — | — | — | — | — | #6 | — | — | — | — | #18 | #33 | #17 | — | — | #13 | #22 | #5 | — | — | — | — | — | #2 | — | — | — |
| 26 | Claude Sonnet 4.5 Claude | 17.4 11/35 rows | — | — | #37 | — | — | — | #22 | — | — | — | — | — | — | — | — | — | — | — | #13 | #46 | #11 | #2 | #9 | — | — | #7 | — | #13 | — | — | #12 | — | #23 | — | — |
| 27 | Qwen3.5 397B A17B Qwen | 18.7 8/35 rows | #35 | #19 | — | #1 | — | #1 | — | — | — | #1 | — | — | — | #58 | — | — | — | — | #27 | — | #7 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 28 | GPT-5 GPT | 20.6 13/35 rows | #38 | #51 | #22 | — | — | — | — | — | — | — | — | — | — | #18 | — | — | — | — | — | #50 | — | — | #3 | — | #16 | #8 | — | #3 | #8 | — | #11 | #6 | #11 | — | — |
| 29 | Qwen3.5-122B-A10B Qwen | 21.7 11/35 rows | #55 | #48 | #39 | #4 | — | #3 | — | — | — | #2 | — | — | — | — | — | — | — | — | — | — | #25 | — | — | #17 | — | — | #9 | — | — | — | — | #13 | — | #4 | — |
| 30 | MiMo-V2.5-Pro Xiaomi | 22.0 4/35 rows | #16 | #40 | — | — | — | — | — | — | — | — | — | — | — | #34 | — | — | — | — | #4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 31 | Grok 4.3 Grok | 22.1 5/35 rows | #14 | #14 | — | — | #41 | — | — | — | — | — | — | — | — | #30 | — | — | — | — | #20 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 32 | MoonshotAI: Kimi K2.5 Kimi | 22.4 13/35 rows | #26 | #27 | #25 | — | #32 | — | #2 | — | — | — | — | — | — | #37 | #2 | — | — | — | #37 | #49 | #28 | — | — | #9 | — | #11 | — | — | — | — | — | #3 | — | — | — |
| 33 | Grok 4.20 Grok | 24.0 9/35 rows | #20 | #8 | #38 | — | #34 | — | — | — | — | — | — | — | — | #20 | — | — | — | — | #39 | #20 | #32 | — | — | #11 | — | — | — | — | — | — | — | — | — | — | — |
| 34 | Claude Haiku 4.5 Claude | 26.6 7/35 rows | — | — | #52 | — | — | — | #28 | — | — | — | — | — | — | — | — | — | — | — | — | #75 | #6 | #6 | — | #12 | — | — | — | — | #7 | — | — | — | — | — | — |
| 35 | DeepSeek V4 Flash DeepSeek | 27.3 6/35 rows | #21 | #18 | #26 | — | #40 | — | — | — | — | — | — | — | — | #50 | — | — | — | — | #21 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 36 | Qwen3.5-27B Qwen | 27.8 13/35 rows | #62 | #46 | #45 | #11 | — | #3 | #32 | — | — | #9 | — | — | — | #75 | — | — | — | — | #38 | — | #4 | — | — | — | — | — | #10 | — | — | — | — | #12 | — | #7 | — |
| 37 | GPT-5.1 GPT | 28.1 10/35 rows | #39 | #32 | #21 | — | #21 | — | #31 | — | — | — | — | — | — | #15 | — | — | — | — | #47 | #43 | — | — | — | — | — | — | — | — | #3 | — | — | — | #10 | — | — |
| 38 | MiniMax M2.7 MiniMax | 28.9 10/35 rows | #31 | #31 | #46 | — | #45 | — | — | — | — | — | — | — | — | #61 | — | — | — | #2 | #50 | — | #5 | — | — | #10 | — | #13 | — | — | — | — | — | — | — | — | — |
| 39 | MiMo-V2-Pro Xiaomi | 29.0 6/35 rows | #30 | #35 | — | — | — | — | — | — | — | — | — | — | — | #45 | — | — | — | — | #41 | — | #15 | — | — | #7 | — | — | — | — | — | — | — | — | — | — | — |
| 40 | GPT-5.4 Mini GPT | 29.5 8/35 rows | #37 | #30 | — | — | #39 | — | — | — | — | — | — | — | — | #16 | — | — | — | — | — | #41 | #48 | — | — | #15 | — | — | #6 | — | — | — | — | — | — | — | — |
| 41 | GLM 4.7 GLM | 30.8 9/35 rows | #49 | #45 | #34 | — | — | — | — | — | — | #5 | — | — | — | #72 | — | — | — | — | #35 | — | — | — | — | #14 | #5 | — | — | — | — | — | #22 | — | — | — | — |
| 42 | Claude Sonnet 4 Claude | 31.2 5/35 rows | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #20 | — | — | — | #62 | #38 | — | #15 | — | — | — | — | — | — | — | #16 | — | — | — | — |
| 43 | Grok 4 Grok | 31.5 7/35 rows | #52 | #28 | #42 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #31 | #44 | — | #9 | #10 | — | — | — | — | — | — | — | — | — | — | — | — |
| 44 | o3 o-series | 31.9 11/35 rows | #69 | #79 | #53 | — | — | — | — | — | — | — | — | — | — | — | — | #2 | — | — | — | #58 | — | #8 | #18 | — | — | — | — | — | #6 | — | #15 | #11 | #14 | — | — |
| 45 | Qwen3.6 27B Qwen | 32.3 8/35 rows | #63 | #64 | #28 | #7 | — | — | — | — | — | — | — | — | — | — | — | — | — | #4 | #45 | — | — | — | — | — | — | — | — | — | — | — | — | #7 | — | #6 | — |
| 46 | Qwen3.6 35B A3B Qwen | 33.7 8/35 rows | #67 | #65 | #36 | #9 | — | — | — | — | — | — | — | — | — | — | — | — | — | #5 | #42 | — | — | — | — | — | — | — | — | — | — | — | — | #9 | — | #3 | — |
| 47 | Qwen3.5-35B-A3B Qwen | 34.5 13/35 rows | #72 | #60 | #56 | #9 | — | #5 | — | — | — | #3 | — | — | — | #85 | — | — | — | — | #52 | — | #43 | — | — | #18 | — | — | #11 | — | — | — | — | #14 | — | #8 | — |
| 48 | DeepSeek V3.2 DeepSeek | 36.2 13/35 rows | #60 | #67 | #47 | — | #48 | — | #16 | — | — | #6 | — | — | — | — | #9 | — | — | — | #43 | #76 | #30 | — | #8 | #29 | — | #12 | — | — | — | — | — | — | — | — | — |
| 49 | MoonshotAI: Kimi K2 Thinking Kimi | 37.4 8/35 rows | #59 | #69 | — | #3 | — | — | — | — | — | #8 | #15 | — | — | #62 | — | — | — | — | #49 | — | — | — | — | — | — | #19 | — | — | — | — | — | — | — | — | — |
| 50 | Grok 4.1 Fast Grok | 37.9 10/35 rows | #82 | #52 | #33 | — | — | — | #34 | — | — | — | — | — | — | #73 | — | — | — | — | #29 | — | #34 | #5 | — | #16 | — | — | — | #12 | — | — | — | — | — | — | — |
| 51 | Gemma 4 31B Gemma | 38.0 9/35 rows | #56 | #47 | — | — | #50 | — | — | — | — | — | — | — | — | #33 | — | — | — | — | #44 | — | #47 | — | — | #21 | #21 | — | — | — | — | — | — | — | #13 | — | — |
| 52 | GPT-5.1-Codex Codex | 38.0 5/35 rows | #54 | #44 | — | — | #31 | — | — | — | — | — | — | — | — | #12 | — | — | — | — | #34 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 53 | R1 0528 DeepSeek | 38.1 5/35 rows | — | — | — | #8 | — | — | — | — | — | #13 | — | — | — | #39 | — | #5 | — | — | — | #115 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 54 | GLM 5 GLM | 38.5 10/35 rows | #36 | #86 | #17 | — | #36 | — | #7 | — | — | — | — | — | — | #46 | — | — | — | — | #26 | #69 | #20 | — | — | — | — | #15 | — | — | — | — | — | — | — | — | — |
| 55 | Qwen3 Max Thinking Qwen | 38.9 4/35 rows | #41 | #42 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #33 | — | #39 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 56 | GPT-5 Mini GPT | 39.9 11/35 rows | #71 | #77 | — | — | #42 | — | — | — | — | — | — | — | — | #40 | — | — | — | — | — | #71 | #40 | #17 | #11 | — | #20 | #18 | — | #10 | — | — | — | — | — | — | — |
| 57 | Qwen3 235B A22B Instruct 2507 Qwen | 40.8 11/35 rows | #95 | #112 | — | #12 | — | #8 | — | #6 | #3 | — | #7 | #4 | — | — | — | — | — | — | — | #111 | — | #23 | — | — | — | — | — | — | #20 | — | — | — | — | — | — |
| 58 | MiMo-V2-Flash Xiaomi | 41.8 6/35 rows | #65 | #58 | #48 | — | — | — | — | — | — | — | — | #1 | — | #48 | — | — | — | — | — | — | #8 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 59 | GPT-5.4 Nano GPT | 41.8 8/35 rows | #40 | #88 | — | — | #26 | — | — | — | — | — | — | — | — | #27 | — | — | — | — | — | #64 | #42 | — | — | #23 | — | — | #12 | — | — | — | — | — | — | — | — |
| 60 | GLM 5 Turbo GLM | 44.8 4/35 rows | #47 | #56 | — | — | — | — | — | — | — | — | — | — | — | #56 | — | — | — | — | — | — | #19 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 61 | MiniMax M2.5 MiniMax | 45.0 8/35 rows | #74 | #55 | — | — | — | — | #5 | — | — | — | — | — | — | #59 | — | — | — | — | #36 | #68 | #14 | — | — | #22 | — | — | — | — | — | — | — | — | — | — | — |
| 62 | Nemotron 3 Super Nemotron | 45.7 9/35 rows | #73 | #101 | — | — | — | #9 | — | — | — | — | — | #6 | — | #86 | — | — | — | — | #60 | — | #10 | — | — | #20 | — | — | — | — | #15 | — | — | — | — | — | — |
| 63 | Gemini 2.5 Pro Gemini | 47.4 8/35 rows | #64 | #61 | #41 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #30 | #67 | #53 | — | #29 | — | — | #20 | — | — | — | — | — | — | — | — | — |
| 64 | MiniMax M2.1 MiniMax | 47.8 5/35 rows | #61 | #75 | — | — | — | — | — | — | — | — | — | — | — | #57 | — | — | — | — | — | — | #12 | — | — | — | — | — | — | — | — | — | #23 | — | — | — | — |
| 65 | MiMo-V2.5 Xiaomi | 48.3 4/35 rows | #48 | #53 | — | — | — | — | — | — | — | — | — | — | — | #68 | — | — | — | — | #32 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 66 | o4 Mini o-series | 49.2 7/35 rows | #83 | #118 | — | — | — | — | — | — | — | — | — | — | — | — | — | #3 | — | — | — | #61 | — | #21 | #26 | — | — | — | — | — | — | — | #14 | — | — | — | — |
| 67 | Step 3.5 Flash StepFun | 50.8 4/35 rows | #57 | #74 | — | — | — | — | #33 | — | — | — | — | — | — | — | — | — | — | — | — | — | #26 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 68 | gpt-oss-120b GPT | 51.3 11/35 rows | #78 | #121 | #84 | — | — | — | #15 | — | — | — | — | — | — | #64 | #8 | — | — | — | #54 | — | #59 | — | #36 | #24 | — | #14 | — | — | — | — | — | — | — | — | — |
| 69 | Gemma 4 26B A4B Gemma | 51.6 8/35 rows | #80 | #109 | — | — | — | — | — | — | — | — | — | — | — | #32 | — | — | — | — | #53 | — | #31 | — | — | #31 | — | #23 | — | — | — | — | — | — | #20 | — | — |
| 70 | Gemini 3.1 Flash Lite Preview Gemini | 54.0 5/35 rows | #89 | #85 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #48 | — | — | — | — | #19 | — | — | — | — | — | — | — | #5 | — | — | — |
| 71 | R1 DeepSeek | 54.6 9/35 rows | #96 | #90 | #86 | — | — | — | #12 | — | — | #6 | — | — | — | — | — | — | #14 | — | — | #107 | — | — | — | — | — | — | — | #7 | — | — | #18 | — | — | — | — |
| 72 | Qwen3 235B A22B Qwen | 57.5 9/35 rows | #129 | #203 | #68 | #28 | — | — | — | — | — | — | — | — | — | — | #7 | #13 | — | — | — | — | — | — | — | — | #9 | #22 | — | — | — | — | #17 | — | — | — | — |
| 73 | Qwen3 Next 80B A3B Instruct Qwen | 58.9 9/35 rows | #202 | #169 | — | #20 | — | #13 | — | #8 | #8 | — | #2 | #2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #18 | — | — | — | — | — | — |
| 74 | MoonshotAI: Kimi K2 0711 Kimi | 59.0 9/35 rows | #213 | #141 | #75 | #13 | — | — | — | — | — | — | — | — | — | — | #6 | — | — | — | — | — | — | #11 | — | — | — | — | — | #11 | #13 | — | — | — | — | — | #6 |
| 75 | Qwen3.5-9B Qwen | 59.9 6/35 rows | #108 | #99 | — | #19 | — | #14 | #10 | — | — | — | — | — | — | — | — | — | — | — | — | — | #63 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 76 | Grok 4 Fast Grok | 60.6 4/35 rows | #87 | #57 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #65 | — | — | #20 | — | — | — | — | — | — | — | — | — | — | — | — |
| 77 | Gemini 2.5 Flash Gemini | 61.7 9/35 rows | #135 | #111 | #80 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #85 | #57 | #15 | #31 | — | #8 | — | — | #8 | — | — | — | — | — | — | — |
| 78 | Qwen3 VL 235B A22B Instruct Qwen | 61.8 10/35 rows | #235 | #195 | — | #16 | — | #11 | — | #7 | #4 | — | #5 | #5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #18 | — | #2 | — |
| 79 | GLM 4.6 GLM | 62.8 5/35 rows | #106 | #122 | — | — | — | — | — | — | — | — | — | — | — | #77 | #12 | — | — | — | — | — | — | #4 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 80 | DeepSeek V3.2 Exp DeepSeek | 62.8 4/35 rows | #103 | #104 | — | — | — | — | — | — | — | #12 | — | — | — | — | — | — | — | — | — | — | — | #14 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 81 | Qwen3 VL 30B A3B Instruct Qwen | 68.3 10/35 rows | #230 | #211 | — | #27 | — | #18 | — | #18 | #9 | — | #13 | #12 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #21 | — | #8 | — |
| 82 | GLM 5V Turbo GLM | 70.3 4/35 rows | #91 | #97 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #56 | — | #24 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 83 | Mercury 2 Inception | 72.3 5/35 rows | #92 | #134 | — | — | — | — | — | — | — | — | — | — | — | #43 | — | — | — | — | — | — | #44 | — | — | — | — | — | — | — | #16 | — | — | — | — | — | — |
| 84 | o3-mini o-series | 74.9 8/35 rows | #176 | #163 | #57 | — | — | — | — | #2 | — | — | — | — | — | — | — | #15 | #1 | — | — | #91 | — | — | — | — | — | — | — | — | — | — | #10 | — | — | — | — |
| 85 | Qwen3 VL 32B Instruct Qwen | 75.0 9/35 rows | #236 | #231 | — | #22 | — | #16 | — | #14 | #7 | — | #12 | #9 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #10 | — |
| 86 | DeepSeek V3.1 DeepSeek | 75.9 5/35 rows | #112 | #123 | #90 | #18 | — | — | — | — | — | #10 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 87 | GLM 4.5 GLM | 76.2 6/35 rows | #123 | #120 | #93 | — | — | — | — | — | — | — | — | — | — | #76 | — | — | — | — | — | — | — | — | #30 | — | — | — | — | — | — | — | — | — | — | — | #3 |
| 88 | Nemotron 3 Nano 30B A3B Nemotron | 76.9 7/35 rows | #149 | #153 | #97 | — | — | #22 | — | — | — | — | — | #8 | — | — | — | — | — | — | — | — | — | — | — | #25 | — | — | — | — | #17 | — | — | — | — | — | — |
| 89 | DeepSeek V3.1 Terminus DeepSeek | 78.3 5/35 rows | #94 | #108 | #96 | — | — | — | — | — | — | — | — | — | — | #47 | — | — | — | — | — | — | — | — | #28 | — | — | — | — | — | — | — | — | — | — | — | — |
| 90 | MoonshotAI: Kimi K2 0905 Kimi | 82.4 7/35 rows | #233 | #139 | — | #13 | — | — | — | — | — | — | — | — | — | #81 | — | — | — | — | — | — | — | — | #21 | — | — | — | — | — | #13 | — | — | — | — | — | #6 |
| 91 | Trinity Large Thinking Arcee | 83.8 4/35 rows | #99 | #156 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #59 | — | #2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 92 | o3 Mini High o-series | 87.0 4/35 rows | #122 | #130 | — | — | — | — | — | — | — | — | — | — | — | — | — | #9 | — | — | — | #80 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 93 | GPT-5 Nano GPT | 88.8 9/35 rows | #183 | #225 | — | — | — | — | #39 | — | — | — | — | — | — | #49 | — | — | — | — | — | #84 | #58 | #24 | #34 | — | — | — | — | #17 | — | — | — | — | — | — | — |
| 94 | gpt-oss-20b GPT | 94.0 8/35 rows | #157 | #214 | #109 | — | — | — | #25 | — | — | — | — | — | — | #65 | — | — | — | — | — | — | #60 | — | — | #28 | #28 | — | — | — | — | — | — | — | — | — | — |
| 95 | Qwen3 Coder Next Qwen | 97.4 4/35 rows | #168 | #171 | — | — | — | — | — | — | — | — | — | — | — | — | #3 | — | — | — | — | — | #41 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 96 | Mistral: Mistral Small 4 Mistral | 98.5 5/35 rows | #166 | #135 | — | — | — | — | — | — | — | — | — | — | — | #69 | — | — | — | — | — | — | #46 | — | — | #26 | — | — | — | — | — | — | — | — | — | — | — |
| 97 | GPT-4.1 GPT | 102.3 11/35 rows | #345 | #236 | #51 | — | — | — | — | #15 | — | — | — | — | — | #66 | — | — | #7 | — | #58 | #126 | — | #20 | #33 | — | #15 | — | — | — | — | — | — | — | — | — | — |
| 98 | Qwen3 Max Qwen | 103.6 4/35 rows | #137 | #144 | — | — | — | — | — | — | — | — | — | — | — | #74 | — | — | — | — | — | — | — | — | #25 | — | — | — | — | — | — | — | — | — | — | — | — |
| 99 | GLM 4.5 Air GLM | 113.2 5/35 rows | #217 | #174 | #108 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #22 | — | — | — | — | — | — | — | — | — | — | — | — | — | #4 |
| 100 | Llama 4 Maverick Llama | 116.1 10/35 rows | #326 | #230 | #111 | — | — | — | #19 | — | — | — | — | — | — | #88 | #13 | — | — | — | — | #133 | #62 | — | — | #32 | — | — | — | — | — | — | #28 | — | — | — | — |
| 101 | Grok Code Fast 1 Grok | 123.7 5/35 rows | #198 | #179 | #78 | — | — | — | — | — | — | — | — | — | — | #63 | — | — | — | — | — | — | — | — | #23 | — | — | — | — | — | — | — | — | — | — | — | — |
| 102 | DeepSeek V3 DeepSeek | 124.3 10/35 rows | #450 | #310 | #82 | #24 | — | — | — | — | — | — | — | — | — | — | — | #27 | #21 | — | — | — | #54 | — | #27 | — | — | — | — | — | — | — | #24 | — | — | — | #27 |
| 103 | o1 o-series | 127.5 4/35 rows | #196 | #164 | #55 | — | — | — | — | — | — | — | — | — | — | — | — | — | #2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 104 | GPT-4.1 Mini GPT | 129.6 8/35 rows | #346 | #238 | #70 | — | — | — | — | #17 | — | — | — | — | — | — | — | — | #8 | — | — | #136 | — | #27 | #38 | — | — | — | — | — | — | — | — | — | — | — | — |
| 105 | Gemma 3 27B Gemma | 140.5 7/35 rows | #334 | #378 | #110 | — | — | — | — | — | — | — | — | — | — | — | #10 | — | — | — | — | — | — | #69 | — | — | #17 | — | — | #16 | — | — | — | — | — | — | — |
| 106 | GPT-4o GPT | 144.0 6/35 rows | #303 | #248 | #74 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #55 | — | — | — | — | — | — | — | — | — | #26 | — | #35 | — | — |
| 107 | Claude 3.7 Sonnet Claude | 144.5 5/35 rows | #322 | #245 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #4 | — | — | — | — | — | — | — | — | — | — | — | — | — | #21 | — | — | — | #14 |
| 108 | Qwen3 VL 8B Instruct Qwen | 151.8 8/35 rows | #475 | #381 | — | #31 | — | #20 | — | #9 | — | — | #11 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #22 | — | #11 | — |
| 109 | Llama 4 Scout Llama | 157.8 8/35 rows | #378 | #290 | #106 | — | — | — | #26 | — | — | — | — | — | — | — | — | — | — | — | — | #134 | #68 | #72 | — | — | — | — | — | — | — | — | #27 | — | — | — | — |
| 110 | Gemini 2.5 Flash Lite Gemini | 159.8 4/35 rows | #228 | #261 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #66 | #52 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 111 | GPT-4.1 Nano GPT | 165.6 8/35 rows | #425 | #339 | #95 | — | — | — | — | #20 | — | — | — | — | — | — | — | — | #22 | — | — | #135 | — | #58 | #39 | — | — | — | — | — | — | — | — | — | — | — | — |
| 112 | GPT-4o (2024-08-06) GPT | 173.2 6/35 rows | #474 | #329 | — | — | — | — | — | #19 | — | — | — | — | — | — | — | #24 | — | — | — | — | — | — | — | — | — | — | — | — | #18 | — | — | #23 | — | — | — |
| 113 | Gemini 2.0 Flash Gemini | 173.8 4/35 rows | #269 | #255 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #106 | — | — | — | — | — | — | — | #14 | — | — | — | — | — | — | — |
| 114 | Phi 4 Phi | 177.7 6/35 rows | #408 | #298 | #92 | — | — | — | #36 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #70 | — | — | #25 | — | — | — | — | — | — | — | — | — | — |
| 115 | Qwen3 Coder 480B A35B Qwen | 190.4 4/35 rows | #374 | #266 | — | — | — | — | — | — | — | — | — | — | — | #70 | #4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 116 | Claude 3.5 Sonnet Claude | 192.3 5/35 rows | #422 | #280 | #77 | — | — | — | — | — | — | — | — | — | — | — | — | #23 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #36 | — | — |
| 117 | GPT-4 Turbo GPT | 192.5 4/35 rows | #467 | — | #98 | — | — | — | — | — | — | — | — | — | — | — | — | #25 | #17 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 118 | Gemma 3 12B Gemma | 221.3 4/35 rows | #324 | #416 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #66 | — | — | — | — | — | #18 | — | — | — | — | — | — | — |
| 119 | GPT-4o-mini GPT | 226.0 5/35 rows | #412 | #382 | #63 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #50 | — | — | — | — | — | — | — | — | — | — | — | #50 | — | — |
| 120 | Gemma 3 4B Gemma | 226.7 4/35 rows | #276 | #453 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | #101 | — | — | — | — | — | #27 | — | — | — | — | — | — | — |
| 121 | Qwen2.5 72B Instruct Qwen | 246.7 4/35 rows | #398 | #349 | #64 | #29 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 122 | Claude 3 Haiku Claude | 263.5 4/35 rows | #421 | #407 | #102 | — | — | — | — | — | — | — | — | — | — | — | — | #28 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 123 | Qwen2.5 Coder 32B Instruct Qwen | 264.6 4/35 rows | #434 | #386 | — | #39 | — | — | — | — | — | — | — | — | — | — | — | — | #24 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Benchmark weights by group:

| Group | Weight | Benchmark | Rows |
|---|---|---|---|
| Core drivers | 1.5x | Humanity's Last Exam | 176 |
| Core drivers | 1.5x | SWE-bench Pro | 22 |
| Core drivers | 1.5x | Creative Writing v3 | 9 |
| High-signal support | 1.25x | ARC-AGI-2 | 51 |
| High-signal support | 1.25x | GPQA Diamond | 176 |
| High-signal support | 1.25x | MATH-500 | 10 |
| High-signal support | 1.25x | LiveCodeBench | 21 |
| High-signal support | 1.25x | Gert Labs Rankings | 59 |
| High-signal support | 1.25x | Structured Output Benchmark | 19 |
| High-signal support | 1.25x | Berkeley Function-Calling Leaderboard | 36 |
| High-signal support | 1.25x | OSWorld-Verified | 15 |
| Balanced support | 1x | MMLU-ProX | 26 |
| Balanced support | 1x | MMLU-Redux | 35 |
| Balanced support | 1x | LiveBench | 26 |
| Balanced support | 1x | MMMU-Pro | 15 |
| Balanced support | 1x | NL2Repo | 9 |
| Balanced support | 1x | EQ-Bench | 8 |
| Balanced support | 1x | WritingBench | 12 |
| Balanced support | 1x | PinchBench | 63 |
| Balanced support | 1x | EnterpriseOps-Gym | 19 |
| Balanced support | 1x | AutomationBench | 5 |
| Balanced support | 1x | AutoBench | 32 |
| Balanced support | 1x | MCPMark | 33 |
| Balanced support | 1x | Tau2 Airline | 14 |
| Balanced support | 1x | LLM-WikiRace | 16 |
| Balanced support | 1x | LiveSQLBench | 25 |
| Balanced support | 1x | Multi-IF | 18 |
| Balanced support | 1x | BrowseComp-zh | 12 |
| Balanced support | 1x | VideoMMMU | 21 |
| Breadth / tie-breakers | 0.75x | Arena-Hard v2 | 12 |
| Breadth / tie-breakers | 0.75x | BigCodeBench-Hard | 13 |
| Breadth / tie-breakers | 0.75x | CC-OCR | 14 |
| Breadth / tie-breakers | 0.75x | BenchLM | 74 |
| Breadth / tie-breakers | 0.75x | ALL Bench LLM | 27 |
| Breadth / tie-breakers | 0.75x | ALE-Bench | 65 |
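The page does not state how Avg Rank is derived. One reading that is consistent with the rows shown is a weight-weighted mean of a model's ranks over only the benchmarks it appears in, using the weights in the table above. The sketch below illustrates that reading in Python. The function name `weighted_avg_rank`, the mapping of the column abbreviations HLE, GD, and SBP to Humanity's Last Exam, GPQA Diamond, and SWE-bench Pro, and the formula itself are assumptions made for illustration, not a documented method of the site.

```python
from typing import Mapping

# Weights copied from the weighting table (only the benchmarks used below).
WEIGHTS = {
    "Humanity's Last Exam": 1.5,
    "SWE-bench Pro": 1.5,
    "GPQA Diamond": 1.25,
    "LiveCodeBench": 1.25,
    "OSWorld-Verified": 1.25,
    "MMLU-Redux": 1.0,
    "MMLU-ProX": 1.0,
    "NL2Repo": 1.0,
    "MCPMark": 1.0,
    "AutomationBench": 1.0,
}


def weighted_avg_rank(
    ranks: Mapping[str, int],
    weights: Mapping[str, float] = WEIGHTS,
) -> float:
    """Weighted mean of ranks over the benchmarks a model is ranked on.

    Hypothetical helper: the aggregation formula is an assumption.
    Benchmarks with no entry for the model are simply left out.
    """
    if not ranks:
        raise ValueError("model has no ranked benchmarks")
    total_weight = sum(weights[name] for name in ranks)
    weighted_sum = sum(weights[name] * rank for name, rank in ranks.items())
    return weighted_sum / total_weight


# Ranks read from the leaderboard rows (abbreviation mapping is assumed).
claude_opus_4_8 = {
    "Humanity's Last Exam": 1,
    "GPQA Diamond": 3,
    "SWE-bench Pro": 1,
    "OSWorld-Verified": 1,
    "AutomationBench": 1,
}
qwen3_7_max = {
    "Humanity's Last Exam": 1,
    "GPQA Diamond": 1,
    "MMLU-Redux": 3,
    "MMLU-ProX": 1,
    "SWE-bench Pro": 1,
    "LiveCodeBench": 2,
    "NL2Repo": 2,
    "MCPMark": 1,
}

for name, ranks in [("Claude Opus 4.8", claude_opus_4_8), ("Qwen3.7 Max", qwen3_7_max)]:
    print(f"{name}: {weighted_avg_rank(ranks):.3f} over {len(ranks)}/35 benchmarks")
```

Under these assumptions both models come out at 1.4 after rounding to one decimal, matching the table, and Claude Opus 4.8 (about 1.385) lands just ahead of Qwen3.7 Max (about 1.447), which agrees with their order at positions 2 and 3. A plain unweighted mean would give Qwen3.7 Max 1.5, so the weighted reading fits the displayed value better, though two rows are not enough to confirm it.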