Qwen3 32B | BenchmarkList

Metadata

Qwen Open source

Aliases: qwen-qwen3-32b, qwen-qwen3-32b-04-28, qwen/qwen3-32b, qwen/qwen3-32b-04-28, qwen3-32b, qwen3-32b-04-28

Benchmark	Category	Rank	Score	Sampled
ADBench	Agentic	9	59	2026-05-06
AgentIF	Agentic	3	58.4	2026-05-27
AMA-Bench	Agentic	6	0.51	2026-05-06
Berkeley Function-Calling Leaderboard	Agentic	29	48.71%	2026-05-27
Berkeley Function-Calling Leaderboard	Agentic	33	46.78%	2026-05-27
CAR-bench	Agentic	10	0.31	2026-05-06
Tau2-Bench Telecom	Agentic	243	29.8%	2026-05-11
Terminal-Bench Hard	Agentic	304	3%	2026-05-11
VitaBench	Agentic	14	5.30	2026-05-06
VitaBench	Agentic	25	4	2026-05-06
WildAgtEval	Agentic	8	49.9%	2026-05-28
OpenUGI	Alignment	778	31.02	2026-05-06
OpenUGI	Alignment	981	25.03	2026-05-06
Stick To Your Role!	Alignment	9	0.74	2026-05-06
ABC-Bench	Coding	10	8.9% +/- 1.1	2026-05-27
Codeforces	Coding	13	0.659	2026-05-28
SciCode	Coding	190	35.4%	2026-05-11
SciCode	Coding	282	28%	2026-05-11
NeoEvalPlusN	Creative	143	9.50	2026-05-06
RedSage-Bench	Cybersecurity	3	85.4%	2026-05-28
MMTU	Data	14	0.51	2026-05-06
MMTU	Data	23	0.38	2026-05-06
GSMA Open Telco Leaderboard	Domain	51	46.77	2026-05-06
EduGuardBench	Education	6	0.72	2026-05-27
Vectara HHEM Hallucination Leaderboard	Factuality	24	94.10	2026-05-06
BizFinBench	Finance	8	68.26	2026-05-27
Arena-Hard	Generalization	16	44.5%	2026-05-27
GeoCode Leaderboard	Geospatial	12	60.67% pass@1	2026-05-28
HealthBench Hard	Healthcare	9	0.5	2026-05-27
Artificial Analysis Intelligence Index	Intelligence	292	16.53	2026-05-11
Artificial Analysis Intelligence Index	Intelligence	334	14.53	2026-05-11
FACTS Grounding	Intelligence	7	0.42	2026-05-06
Humanity's Last Exam	Intelligence	182	8.3%	2026-05-11
Humanity's Last Exam	Intelligence	386	4.3%	2026-05-11
MMLU-Pro	Intelligence	120	79.8%	2026-05-11
MMLU-Pro	Intelligence	200	72.7%	2026-05-11
TableBench	Intelligence	13	52.45%	2026-05-27
AraGen v3	Language	34	35.18	2026-05-06
Open Arabic LLM Leaderboard	Language	131	41.17	2026-05-06
Open Portuguese LLM Leaderboard	Language	55	85.43	2026-05-06
J1-ENVS	Legal	3	59.14	2026-05-26
LEXam	Legal	26	40.00% open / 45.30% MCQ	2026-05-28
ConStory-Bench	Long Context	6	CED 0.537	2026-05-28
AIME 2025	Math	88	73%	2026-05-11
AIME 2025	Math	210	19.7%	2026-05-11
MEDIC Benchmark	Medical	71	55.71 average normalized public table score	2026-05-27
LanguageBench	Multilingual	29	0.14	2026-05-06
GPQA Diamond	Reasoning	233	66.8%	2026-05-11
GPQA Diamond	Reasoning	323	53.5%	2026-05-11
CritPt	Science	148	0.3%	2026-05-11
SciPredict	Science	9	17.04	2026-05-06
K-MetBench	Weather	46	47.5% accuracy	2026-05-28
