Qwen3 8B

Qwen / Qwen

54 scores
41 benchmarks
$0.05 / $0.4 per 1M tokens (cost in/out)

Metadata

Qwen Open source

Aliases: qwen-qwen3-8b, qwen-qwen3-8b-04-28, qwen/qwen3-8b, qwen/qwen3-8b-04-28, qwen3-8b, qwen3-8b-04-28

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ADBench | Agentic | 10 | 58 | 2026-05-06 |
| AMA-Bench | Agentic | 13 | 0.41 | 2026-05-06 |
| Berkeley Function-Calling Leaderboard | Agentic | 39 | 42.57% | 2026-05-27 |
| Berkeley Function-Calling Leaderboard | Agentic | 44 | 40.43% | 2026-05-27 |
| Tau2-Bench Telecom | Agentic | 260 | 27.8% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 284 | 24.9% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 319 | 2.3% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 320 | 2.3% | 2026-05-11 |
| OpenUGI | Alignment | 705 | 32.43 | 2026-05-06 |
| OpenUGI | Alignment | 719 | 32.18 | 2026-05-06 |
| Stick To Your Role! | Alignment | 13 | 0.72 | 2026-05-06 |
| BTZSC | Classification | 3 | 66.49 | 2026-05-06 |
| ABC-Bench | Coding | 11 | 8.3% +/- 1.1 | 2026-05-27 |
| SciCode | Coding | 351 | 22.6% | 2026-05-11 |
| SciCode | Coding | 392 | 16.8% | 2026-05-11 |
| TuRTLe Code Completion (Icarus Verilog) | Coding | 21 | 48.16 | 2026-05-06 |
| TuRTLe Code Completion (Verilator) | Coding | 22 | 48.82 | 2026-05-06 |
| TuRTLe Spec-to-RTL (Icarus Verilog) | Coding | 23 | 45.10 | 2026-05-06 |
| TuRTLe Spec-to-RTL (Verilator) | Coding | 23 | 46.23 | 2026-05-06 |
| RedSage-Bench | Cybersecurity | 9 | 81.85% | 2026-05-28 |
| MMTU | Data | 17 | 0.48 | 2026-05-06 |
| MMTU | Data | 24 | 0.35 | 2026-05-06 |
| GSMA Open Telco Leaderboard | Domain | 68 | 41.07 | 2026-05-06 |
| EduGuardBench | Education | 13 | 0.61 | 2026-05-27 |
| Vectara HHEM Hallucination Leaderboard | Factuality | 9 | 95.20 | 2026-05-06 |
| Fin-RATE | Finance | 15 | 5.48% | 2026-05-28 |
| FinToolBench | Finance | 1 | 0.4234 | 2026-05-27 |
| GeoCode Leaderboard | Geospatial | 21 | 48.35% pass@1 | 2026-05-28 |
| HealthBench Hard | Healthcare | 5 | 0.56 | 2026-05-27 |
| Artificial Analysis Intelligence Index | Intelligence | 363 | 13.18 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 411 | 10.63 | 2026-05-11 |
| FACTS Grounding | Intelligence | 10 | 0.40 | 2026-05-06 |
| Humanity's Last Exam | Intelligence | 401 | 4.2% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 477 | 2.8% | 2026-05-11 |
| MMLU-Pro | Intelligence | 188 | 74.3% | 2026-05-11 |
| MMLU-Pro | Intelligence | 255 | 64.3% | 2026-05-11 |
| AraGen v3 | Language | 45 | 26.55 | 2026-05-06 |
| La Leaderboard | Language | 52 | 13.53 | 2026-05-06 |
| Open Arabic LLM Leaderboard | Language | 128 | 42.41 | 2026-05-06 |
| Open Portuguese LLM Leaderboard | Language | 83 | 84.48 | 2026-05-06 |
| Ukrainian LLM Leaderboard | Language | 13 | 12.18 | 2026-05-06 |
| J1-ENVS | Legal | 14 | 42.48 | 2026-05-26 |
| AIME 2025 | Math | 199 | 24.3% | 2026-05-11 |
| AIME 2025 | Math | 212 | 19% | 2026-05-11 |
| MEDIC Benchmark | Medical | 74 | 55.41 average normalized public table score | 2026-05-27 |
| Medmarks | Medical | 12 | 0.3634064599826541 | 2026-05-27 |
| Medmarks | Medical | 38 | 0.5019479387680824 | 2026-05-27 |
| FLORES European Languages Leaderboard | Multilingual | 9 | 43.25 | 2026-05-06 |
| INCLUDE-base-44 European Languages | Multilingual | 8 | 0.61 | 2026-05-06 |
| GPQA Diamond | Reasoning | 288 | 58.9% | 2026-05-11 |
| GPQA Diamond | Reasoning | 370 | 45.2% | 2026-05-11 |
| CritPt | Science | 354 | 0% | 2026-05-11 |
| CritPt | Science | 355 | 0% | 2026-05-11 |
| K-MetBench | Weather | 21 | 70.1% accuracy | 2026-05-28 |