Qwen3.5-122B-A10B | BenchmarkList

Metadata

Qwen (open source)

Aliases: qwen-qwen3.5-122b-a10b, qwen-qwen3.5-122b-a10b-20260224, qwen/qwen3.5-122b-a10b, qwen/qwen3.5-122b-a10b-20260224, qwen3.5-122b-a10b, qwen3.5-122b-a10b-20260224

Benchmark Results

Benchmark	Category	Rank	Score	Sampled
AutoBench	Agentic	17	2.84	2026-05-06
DeepPlanning	Agentic	4	0.24	2026-05-06
OSWorld-Verified	Agentic	9	0.58	2026-05-06
PinchBench	Agentic	25	0.85	2026-05-06
ScreenSpot-Pro	Agentic	4	70.40	2026-05-06
τ²-bench	Agentic	13	0.80	2026-05-06
Tau2-Bench Telecom	Agentic	34	93.6%	2026-05-11
Tau2-Bench Telecom	Agentic	90	84.5%	2026-05-11
Terminal-Bench Hard	Agentic	95	31.1%	2026-05-11
Terminal-Bench Hard	Agentic	99	29.5%	2026-05-11
TIR-Bench	Agentic	4	0.53	2026-05-06
YC-Bench	Agentic	11	0	2026-05-06
OpenUGI	Alignment	1108	18.07	2026-05-06
OpenUGI	Alignment	1111	17.71	2026-05-06
Arena AI Code	Coding	44	1363	2026-05-06
Codeforces	Coding	4	0.851	2026-05-28
FullStackBench en	Coding	1	0.63	2026-05-06
FullStackBench zh	Coding	1	0.59	2026-05-06
SciCode	Coding	69	42%	2026-05-11
SciCode	Coding	187	35.6%	2026-05-11
OmniDocBench 1.5	Document Understanding	3	0.90	2026-05-06
EmbSpatialBench	Embodied	5	0.84	2026-05-06
Vectara HHEM Hallucination Leaderboard	Factuality	68	88.80	2026-05-06
BenchLM	General Knowledge	39	65	2026-05-06
MAXIFE	General Knowledge	4	0.88	2026-05-06
MMLU-ProX	General Knowledge	3	0.82	2026-05-06
MMLU-Redux	General Knowledge	4	0.94	2026-05-06
NOVA-63	General Knowledge	2	0.59	2026-05-06
MedXpertQA	Healthcare	2	0.67	2026-05-06
PMC-VQA	Healthcare	1	0.63	2026-05-06
SlakeVQA	Healthcare	1	0.82	2026-05-06
IFBench	Instruction Following	4	0.76	2026-05-06
Artificial Analysis Intelligence Index	Intelligence	65	41.6	2026-05-11
Artificial Analysis Intelligence Index	Intelligence	104	35.87	2026-05-11
Humanity's Last Exam	Intelligence	55	23.4%	2026-05-11
Humanity's Last Exam	Intelligence	97	14.8%	2026-05-11
MathVision	Intelligence	9	86.20	2026-05-06
AA-LCR	Long Context	5	0.67	2026-05-06
DynaMath	Mathematics	3	0.86	2026-05-06
HMMT 2025	Mathematics	14	0.91	2026-05-06
PolyMATH	Mathematics	4	0.69	2026-05-06
BabyVision	Multimodal	3	0.40	2026-05-06
CC-OCR	Multimodal	4	0.82	2026-05-06
CharXiv-R	Multimodal	17	0.77	2026-05-06
LingoQA	Multimodal	2	0.81	2026-05-06
LVBench	Multimodal	2	0.74	2026-05-06
MLVU	Multimodal	1	0.87	2026-05-06
MMVU	Multimodal	2	0.75	2026-05-06
Nuscene	Multimodal	1	0.15	2026-05-06
SimpleVQA	Multimodal	5	0.62	2026-05-06
VideoMME w sub.	Multimodal	2	0.87	2026-05-06
VideoMME w/o sub.	Multimodal	1	0.84	2026-05-06
VideoMMMU	Multimodal	13	0.82	2026-05-06
VLMsAreBlind	Multimodal	4	0.97	2026-05-06
ZEROBench	Multimodal	4	0.09	2026-05-06
ZEROBench-Sub	Multimodal	1	0.36	2026-05-06
Artificial Analysis Openness Index	Openness	135	38.89	2026-05-11
Artificial Analysis Openness Index	Openness	136	38.89	2026-05-11
ERQA	Reasoning	7	0.62	2026-05-06
Global PIQA	Reasoning	5	0.88	2026-05-06
GPQA Diamond	Reasoning	48	85.7%	2026-05-11
GPQA Diamond	Reasoning	80	82.7%	2026-05-11
OJBench	Reasoning	4	0.40	2026-05-06
CritPt	Science	99	0.9%	2026-05-11
CritPt	Science	117	0.6%	2026-05-11
BrowseComp-zh	Search	2	0.70	2026-05-06
Seal-0	Search	5	0.44	2026-05-06
WideSearch	Search	6	0.60	2026-05-06
CountBench	Spatial Reasoning	5	0.97	2026-05-06
Hypersim	Spatial Reasoning	3	0.13	2026-05-06
RefCOCO-avg	Spatial Reasoning	5	0.91	2026-05-06
RefSpatialBench	Spatial Reasoning	3	0.69	2026-05-06
SUNRGBD	Spatial Reasoning	1	0.36	2026-05-06
BFCL-V4	Tool Use	2	0.72	2026-05-06
WMT24++	Translation	5	0.78	2026-05-06
ODinW	Vision	8	0.45	2026-05-06