Qwen2.5 7B Instruct

Qwen / Qwen

39 scores
33 benchmarks
Cost (in/out): $0.04 / $0.10 per 1M tokens
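
As a rough illustration of what the listed pricing means in practice, the sketch below estimates the cost of a workload at $0.04 per 1M input tokens and $0.10 per 1M output tokens. The function name `estimate_cost` and the token counts are hypothetical, chosen only for the example; actual billing may differ by provider.

```python
# Prices taken from the listing above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.04
OUTPUT_PRICE_PER_M = 0.10


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload: 2,000,000 input tokens and 500,000 output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"${cost:.2f}")
```

Input and output tokens are priced separately, so output-heavy workloads cost more per token than prompt-heavy ones.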

Metadata

Qwen Open source

Aliases: qwen-2.5-7b-instruct, qwen-qwen-2.5-7b-instruct, qwen/qwen-2.5-7b-instruct

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
| --- | --- | --- | --- | --- |
| PinchBench | Agentic | 64 | 0.40 | 2026-05-06 |
| UAVBench | Agentic | 22 | 66.05 | 2026-05-06 |
| OpenUGI | Alignment | 1004 | 23.75 | 2026-05-06 |
| BigCodeBench | Coding | 69 | 37.60 | 2026-05-06 |
| MultiPL-E | Coding | 9 | 0.704 | 2026-05-27 |
| TuRTLe Module Completion (NotSoTiny) | Coding | 14 | 3.48 | 2026-05-06 |
| MMTU | Data | 25 | 0.31 | 2026-05-06 |
| BizFinBench | Finance | 19 | 56.35 | 2026-05-27 |
| FinEval | Finance | 31 | 62.3 | 2026-05-27 |
| Open FinLLM Leaderboard | Finance | 16 | 17.365888% | 2026-05-27 |
| AlignBench | General Knowledge | 3 | 0.73 | 2026-05-06 |
| MMLU-Redux | General Knowledge | 40 | 0.75 | 2026-05-06 |
| Open LLM Leaderboard v2 | General Knowledge | 623 | 35.20 | 2026-05-06 |
| HealthBench Hard | Healthcare | 18 | 0.42 | 2026-05-27 |
| FACTS Grounding | Intelligence | 21 | 0.26 | 2026-05-06 |
| MuSR | Intelligence | 2713 | 8.45 | 2026-05-06 |
| AraGen v3 | Language | 46 | 26.25 | 2026-05-06 |
| La Leaderboard | Language | 17 | 23.66 | 2026-05-06 |
| Open Arabic LLM Leaderboard | Language | 79 | 59.80 | 2026-05-06 |
| Open Japanese LLM Leaderboard | Language | 393 | 53.04 | 2026-05-06 |
| Open Japanese LLM Leaderboard | Language | 697 | 28.96 | 2026-05-06 |
| Open Ko-LLM Leaderboard | Language | 72 | 46.80 | 2026-05-06 |
| Open Ko-LLM Leaderboard | Language | 75 | 46.74 | 2026-05-06 |
| Open Portuguese LLM Leaderboard | Language | 176 | 82.68 | 2026-05-06 |
| J1-ENVS | Legal | 8 | 51.35 | 2026-05-26 |
| MATH Level 5 | Math | 138 | 50 | 2026-05-06 |
| BRIDGE Medical Leaderboard | Medical | 81 | 41.6 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 218 | 31.32 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 225 | 30.25 | 2026-05-27 |
| MEDIC Benchmark | Medical | 37 | 64.83 (average normalized public table score) | 2026-05-27 |
| FLORES European Languages Leaderboard | Multilingual | 13 | 39.84 | 2026-05-06 |
| INCLUDE-base-44 European Languages | Multilingual | 15 | 0.54 | 2026-05-06 |
| ZebraLogic | Reasoning | 51 | 12 | 2026-05-06 |
| ThaiSafetyBench | Safety | 7 | 14.43% (overall ASR) | 2026-05-28 |
| JSONSchemaBench | Structured Output | 32 | 54.8% (schema compliance) | 2026-05-28 |
| JSONSchemaBench | Structured Output | 39 | 33.2% (schema compliance) | 2026-05-28 |
| JSONSchemaBench | Structured Output | 45 | 5.38% (schema compliance) | 2026-05-28 |
| StructEval | Structured Output | 9 | 59.03% | 2026-05-28 |
| VNTL Leaderboard | Translation | 39 | 65.18 | 2026-05-06 |