Qwen3 VL 235B A22B Instruct
Qwen / Qwen
40 scores
40 benchmarks
$0.2 / $0.88 per 1M tokens (cost in/out)
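The listed rates can be turned into a per-request estimate with simple arithmetic. The sketch below assumes that "in/out" means input and output tokens and that both prices are in USD per one million tokens; the function name `estimate_cost` and the token counts in the example are hypothetical, chosen only for illustration.

```python
# Listed rates, assumed to be USD per 1M tokens (input / output).
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.88


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 12,000 input tokens and 800 output tokens.
cost = estimate_cost(input_tokens=12_000, output_tokens=800)
print(f"${cost:.6f}")  # prints $0.003104
```

Actual billing may differ by provider, for example if image or video inputs are metered separately, so treat the result as a rough estimate.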
Metadata
Qwen · Open source
Aliases: qwen-qwen3-vl-235b-a22b-instruct, qwen/qwen3-vl-235b-a22b-instruct, qwen3-vl-235b-a22b-instruct
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| Tau2-Bench Telecom | Agentic | 215 | 35.1% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 247 | 6.8% | 2026-05-11 |
| OpenUGI | Alignment | 202 | 45.56 | 2026-05-06 |
| MultiPL-E | Coding | 3 | 0.861 | 2026-05-27 |
| SciCode | Coding | 180 | 35.9% | 2026-05-11 |
| Arena-Hard v2 | General Knowledge | 5 | 0.77 | 2026-05-06 |
| CSimpleQA | General Knowledge | 3 | 0.83 | 2026-05-06 |
| MMLU-ProX | General Knowledge | 11 | 0.78 | 2026-05-06 |
| MMLU-Redux | General Knowledge | 16 | 0.92 | 2026-05-06 |
| Multi-IF | Instruction Following | 7 | 0.76 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 235 | 20.75 | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 235 | 6.3% | 2026-05-11 |
| MathVision | Intelligence | 33 | 66 | 2026-05-06 |
| MMLU-Pro | Intelligence | 71 | 82.3% | 2026-05-11 |
| AIME 2025 | Math | 95 | 70.7% | 2026-05-11 |
| BLINK | Multimodal | 1 | 0.71 | 2026-05-06 |
| CC-OCR | Multimodal | 2 | 0.82 | 2026-05-06 |
| CharadesSTA | Multimodal | 1 | 0.65 | 2026-05-06 |
| CharXiv-R | Multimodal | 23 | 0.62 | 2026-05-06 |
| InfoVQAtest | Multimodal | 3 | 0.89 | 2026-05-06 |
| LVBench | Multimodal | 6 | 0.68 | 2026-05-06 |
| Math-VR | Multimodal | 2 | 65.0 | 2026-05-27 |
| MLVU | Multimodal | 7 | 0.84 | 2026-05-06 |
| MM-MT-Bench | Multimodal | 2 | 8.50 | 2026-05-06 |
| MMLongBench-Doc | Multimodal | 6 | 57 | 2026-05-06 |
| MuirBench | Multimodal | 6 | 0.73 | 2026-05-06 |
| Physical AI Bench Understanding | Multimodal | 4 | 64.70 | 2026-05-06 |
| VideoMME w/o sub. | Multimodal | 5 | 0.79 | 2026-05-06 |
| VideoMMMU | Multimodal | 18 | 0.75 | 2026-05-06 |
| OCRBench-V2 (en) | OCR | 3 | 0.67 | 2026-05-06 |
| OCRBench-V2 (zh) | OCR | 3 | 0.62 | 2026-05-06 |
| Artificial Analysis Openness Index | Openness | 61 | 50 | 2026-05-11 |
| ERQA | Reasoning | 11 | 0.51 | 2026-05-06 |
| GPQA Diamond | Reasoning | 195 | 71.2% | 2026-05-11 |
| CritPt | Science | 366 | 0% | 2026-05-11 |
| BFCL-v3 | Tool Use | 13 | 0.68 | 2026-05-06 |
| ODinW | Vision | 3 | 0.49 | 2026-05-06 |
| K-MetBench | Weather | 18 | 72.4% | 2026-05-28 |
| Creative Writing v3 | Writing | 4 | 0.86 | 2026-05-06 |
| WritingBench | Writing | 5 | 0.85 | 2026-05-06 |