Qwen3 VL 235B A22B Instruct

Metadata

Qwen Open source

Aliases: qwen-qwen3-vl-235b-a22b-instruct, qwen/qwen3-vl-235b-a22b-instruct, qwen3-vl-235b-a22b-instruct

Benchmark	Category	Rank	Score	Sampled
Tau2-Bench Telecom	Agentic	215	35.1%	2026-05-11
Terminal-Bench Hard	Agentic	247	6.8%	2026-05-11
OpenUGI	Alignment	202	45.56	2026-05-06
MultiPL-E	Coding	3	0.861	2026-05-27
SciCode	Coding	180	35.9%	2026-05-11
Arena-Hard v2	General Knowledge	5	0.77	2026-05-06
CSimpleQA	General Knowledge	3	0.83	2026-05-06
MMLU-ProX	General Knowledge	11	0.78	2026-05-06
MMLU-Redux	General Knowledge	16	0.92	2026-05-06
Multi-IF	Instruction Following	7	0.76	2026-05-06
Artificial Analysis Intelligence Index	Intelligence	235	20.75	2026-05-11
Humanity's Last Exam	Intelligence	235	6.3%	2026-05-11
MathVision	Intelligence	33	66	2026-05-06
MMLU-Pro	Intelligence	71	82.3%	2026-05-11
AIME 2025	Math	95	70.7%	2026-05-11
BLINK	Multimodal	1	0.71	2026-05-06
CC-OCR	Multimodal	2	0.82	2026-05-06
CharadesSTA	Multimodal	1	0.65	2026-05-06
CharXiv-R	Multimodal	23	0.62	2026-05-06
InfoVQAtest	Multimodal	3	0.89	2026-05-06
LVBench	Multimodal	6	0.68	2026-05-06
Math-VR	Multimodal	2	65.0	2026-05-27
MLVU	Multimodal	7	0.84	2026-05-06
MM-MT-Bench	Multimodal	2	8.50	2026-05-06
MMLongBench-Doc	Multimodal	6	57	2026-05-06
MuirBench	Multimodal	6	0.73	2026-05-06
Physical AI Bench Understanding	Multimodal	4	64.70	2026-05-06
VideoMME w/o sub.	Multimodal	5	0.79	2026-05-06
VideoMMMU	Multimodal	18	0.75	2026-05-06
OCRBench-V2 (en)	OCR	3	0.67	2026-05-06
OCRBench-V2 (zh)	OCR	3	0.62	2026-05-06
Artificial Analysis Openness Index	Openness	61	50	2026-05-11
ERQA	Reasoning	11	0.51	2026-05-06
GPQA Diamond	Reasoning	195	71.2%	2026-05-11
CritPt	Science	366	0%	2026-05-11
BFCL-v3	Tool Use	13	0.68	2026-05-06
ODinW	Vision	3	0.49	2026-05-06
K-MetBench	Weather	18	72.4%	2026-05-28
Creative Writing v3	Writing	4	0.86	2026-05-06
WritingBench	Writing	5	0.85	2026-05-06
