Qwen3 VL 8B Instruct | BenchmarkList

Metadata

Developer: Qwen
License type: Open source

Aliases: qwen-qwen3-vl-8b-instruct, qwen/qwen3-vl-8b-instruct, qwen3-vl-8b-instruct

Benchmark	Category	Rank	Score	Sampled
Tau2-Bench Telecom	Agentic	247	29.2%	2026-05-11
Terminal-Bench Hard	Agentic	321	2.3%	2026-05-11
UAVBench	Agentic	13	75.95	2026-05-06
OpenUGI	Alignment	690	32.78	2026-05-06
SciCode	Coding	388	17.4%	2026-05-11
VLMbench Performance	Efficiency	5	953.8 tok/s	2026-05-28
MMLU-ProX	General Knowledge	20	0.65	2026-05-06
MMLU-Redux	General Knowledge	31	0.85	2026-05-06
GeoRC	Geospatial	13	23.81	2026-05-27
Multi-IF	Instruction Following	9	0.75	2026-05-06
Artificial Analysis Intelligence Index	Intelligence	343	14.3	2026-05-11
Humanity's Last Exam	Intelligence	475	2.9%	2026-05-11
MMLU-Pro	Intelligence	233	68.6%	2026-05-11
AIME 2025	Math	191	27.3%	2026-05-11
PolyMATH	Mathematics	18	0.30	2026-05-06
LatamBoard	Multilingual	10	68.82	2026-05-06
BLINK	Multimodal	2	0.69	2026-05-06
CC-OCR	Multimodal	11	0.80	2026-05-06
CharadesSTA	Multimodal	9	0.56	2026-05-06
CharXiv-D	Multimodal	11	0.83	2026-05-06
CharXiv-R	Multimodal	32	0.46	2026-05-06
InfoVQAtest	Multimodal	9	0.83	2026-05-06
LVBench	Multimodal	12	0.58	2026-05-06
MDPBench	Multimodal	6	68.30	2026-05-06
MM-MT-Bench	Multimodal	9	7.70	2026-05-06
MuirBench	Multimodal	8	0.64	2026-05-06
ParseBench	Multimodal	7	46.80	2026-05-06
Physical AI Bench Understanding	Multimodal	13	56.80	2026-05-06
VideoMMMU	Multimodal	22	0.65	2026-05-06
OCRBench-V2 (en)	OCR	5	0.65	2026-05-06
OCRBench-V2 (zh)	OCR	4	0.61	2026-05-06
Artificial Analysis Openness Index	Openness	69	50	2026-05-11
ERQA	Reasoning	15	0.46	2026-05-06
GPQA Diamond	Reasoning	381	42.7%	2026-05-11
CritPt	Science	373	0%	2026-05-11
BFCL-v3	Tool Use	15	0.66	2026-05-06
ODinW	Vision	7	0.45	2026-05-06
K-MetBench	Weather	39	53.8% accuracy	2026-05-28
WritingBench	Writing	11	0.83	2026-05-06
