Qwen3 Max
Qwen / Qwen
48 scores
39 benchmarks
Cost (input / output): $0.78 / $3.9 per 1M tokens
Metadata
Qwen Open source
Aliases: qwen-qwen3-max, qwen/qwen3-max, qwen3-max
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| MCPMark | Agentic | 25 | 0.18 | 2026-05-06 |
| t2-bench | Agentic | 16 | 0.75 | 2026-05-06 |
| Tau2-Bench Telecom | Agentic | 124 | 74.3% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 228 | 32.7% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 146 | 20.5% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 147 | 19.7% | 2026-05-11 |
| UAVBench | Agentic | 4 | 79.85 | 2026-05-06 |
| Vending-Bench 2 | Agentic | 35 | 71.57 | 2026-05-28 |
| VitaBench | Agentic | 21 | 14.30 | 2026-05-06 |
| OpenUGI | Alignment | 261 | 43.71 | 2026-05-06 |
| ALE-Bench | Coding | 74 | 370.45 | 2026-05-06 |
| IOI | Coding | 20 | 15.666% | 2026-05-26 |
| LiveCodeBench | Coding | 50 | 78.215% | 2026-05-28 |
| SciCode | Coding | 135 | 38.3% | 2026-05-11 |
| SciCode | Coding | 154 | 37% | 2026-05-11 |
| Terminal-Bench 2.0 | Coding | 49 | 24.719% | 2026-05-28 |
| Vibe Code Bench v1.1 | Coding | 40 | 3.506% | 2026-05-28 |
| VibeCodingBench | Coding | 5 | 88.60 | 2026-05-06 |
| SecCodeBench | Cybersecurity | 23 | 51.23% | 2026-05-28 |
| CorpFin v2 | Finance | 67 | 55.944% | 2026-05-28 |
| Finance Agent v1.1 | Finance | 39 | 44.295% | 2026-05-04 |
| TaxEval v2 | Finance | 30 | 73.508% | 2026-05-28 |
| WeirdML | Generalization | 13 | 41.17 | 2026-05-06 |
| MedQA | Healthcare | 58 | 87.367% | 2026-04-16 |
| Artificial Analysis Intelligence Index | Intelligence | 136 | 31.38 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 182 | 26.08 | 2026-05-11 |
| GPQA Diamond | Intelligence | 42 | 79.546% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 137 | 11.1% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 169 | 9.3% | 2026-05-11 |
| MMLU-Pro | Intelligence | 38 | 84.36% | 2026-05-28 |
| MMLU-Pro | Intelligence | 43 | 84.1% | 2026-05-11 |
| MMLU-Pro | Intelligence | 45 | 83.8% | 2026-05-11 |
| CaseLaw v2 | Legal | 52 | 47.481% | 2026-05-04 |
| LegalBench | Legal | 47 | 81.861% | 2026-05-28 |
| AIME | Math | 46 | 81.042% | 2026-04-16 |
| AIME 2025 | Math | 63 | 80.7% | 2026-05-11 |
| AIME 2025 | Math | 80 | 75% | 2026-05-11 |
| MGSM | Math | 25 | 91.818% | 2026-01-09 |
| OTIS Mock AIME 2024-2025 | Math | 14 | 73.33 | 2026-05-06 |
| Design Arena | Multimodal | 75 | 1170 | 2026-05-06 |
| Artificial Analysis Openness Index | Openness | 189 | 16.67 | 2026-05-11 |
| GPQA Diamond | Reasoning | 144 | 76.4% | 2026-05-11 |
| GPQA Diamond | Reasoning | 145 | 76.4% | 2026-05-11 |
| CritPt | Science | 98 | 0.9% | 2026-05-11 |
| CritPt | Science | 359 | 0% | 2026-05-11 |
| GSO-Bench | Science | 6 | 4.90 | 2026-05-06 |
| IDE-Bench | Software Engineering | 6 | 65 | 2026-05-27 |
| Lech Mazur Writing | Writing | 2 | 8.71 | 2026-05-06 |