GLM

GLM 4.6

GLM / Z.ai

50 scores
39 benchmarks
$0.39 / $1.9 per 1M tokens (cost in/out)
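
As a rough illustration of how the listed prices translate into a per-request cost, the sketch below assumes the first figure ($0.39) is the input price and the second ($1.9) is the output price, both per 1M tokens. The function name `estimate_cost_usd` and the token counts are hypothetical and only serve the example.

```python
# Assumed from the pricing line above: USD per 1M tokens, input / output.
INPUT_USD_PER_MILLION = 0.39
OUTPUT_USD_PER_MILLION = 1.9


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 200k-token prompt with a 50k-token completion.
    print(f"${estimate_cost_usd(200_000, 50_000):.4f}")  # prints $0.1730
```

Because the output price is roughly five times the input price, completion-heavy workloads dominate the bill under these assumptions.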

Metadata

GLM Open source

Aliases: glm-4.6, z-ai-glm-4.6, z-ai/glm-4.6

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
| --- | --- | --- | --- | --- |
| APEX-Agents | Agentic | 34 | 11.80 | 2026-05-06 |
| Berkeley Function-Calling Leaderboard | Agentic | 4 | 72.38% | 2026-05-27 |
| MCP-Universe | Agentic | 12 | 25.97 | 2026-05-06 |
| Poker Agent | Agentic | 16 | 945.756% | 2025-12-23 |
| Tau2-Bench Telecom | Agentic | 115 | 76.9% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 136 | 70.5% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 101 | 28.8% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 122 | 25% | 2026-05-11 |
| UAVBench | Agentic | 30 | 41.70 | 2026-05-06 |
| OpenUGI | Alignment | 114 | 49.83 | 2026-05-06 |
| OpenUGI | Alignment | 295 | 42.54 | 2026-05-06 |
| ALE-Bench | Coding | 77 | 340.82 | 2026-05-06 |
| Arena AI Code | Coding | 46 | 1355 | 2026-05-06 |
| IOI | Coding | 40 | 4.334% | 2026-05-26 |
| LiveCodeBench | Coding | 40 | 81.036% | 2026-05-28 |
| SciCode | Coding | 130 | 38.4% | 2026-05-11 |
| SciCode | Coding | 219 | 33.1% | 2026-05-11 |
| Terminal-Bench 2.0 | Coding | 46 | 28.09% | 2026-05-28 |
| Vibe Code Bench v1.1 | Coding | 41 | 3.09% | 2026-05-28 |
| NeoEvalPlusN | Creative | 27 | 18 | 2026-05-06 |
| AI Energy Score | Efficiency | 197 | 1 | 2026-05-06 |
| Vectara HHEM Hallucination Leaderboard | Factuality | 50 | 90.50 | 2026-05-06 |
| CorpFin v2 | Finance | 65 | 56.838% | 2026-05-28 |
| Finance Agent v1.1 | Finance | 43 | 36.48% | 2026-05-04 |
| TaxEval v2 | Finance | 85 | 66.235% | 2026-05-28 |
| MedQA | Healthcare | 34 | 92.225% | 2026-04-16 |
| Artificial Analysis Intelligence Index | Intelligence | 126 | 32.51 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 147 | 30.24 | 2026-05-11 |
| GPQA Diamond | Intelligence | 57 | 74.495% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 106 | 13.3% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 277 | 5.2% | 2026-05-11 |
| MMLU Pro | Intelligence | 52 | 82.203% | 2026-05-28 |
| MMLU-Pro | Intelligence | 61 | 82.9% | 2026-05-11 |
| MMLU-Pro | Intelligence | 138 | 78.4% | 2026-05-11 |
| LegalBench | Legal | 63 | 79.608% | 2026-05-28 |
| ConStory-Bench | Long Context | 5 | CED 0.528 | 2026-05-28 |
| AIME | Math | 17 | 92.708% | 2026-04-16 |
| AIME 2025 | Math | 44 | 86% | 2026-05-11 |
| AIME 2025 | Math | 149 | 44.3% | 2026-05-11 |
| MGSM | Math | 45 | 89.746% | 2026-01-09 |
| LiveMedBench | Medical | 10 | 0.1759 | 2026-05-27 |
| Design Arena | Multimodal | 50 | 1224 | 2026-05-06 |
| Artificial Analysis Openness Index | Openness | 82 | 44.44 | 2026-05-11 |
| Artificial Analysis Openness Index | Openness | 83 | 44.44 | 2026-05-11 |
| GPQA Diamond | Reasoning | 122 | 78% | 2026-05-11 |
| GPQA Diamond | Reasoning | 259 | 63.2% | 2026-05-11 |
| LiveSecBench | Safety | 22 | 44.87 | 2026-05-27 |
| CritPt | Science | 81 | 1.1% | 2026-05-11 |
| CritPt | Science | 206 | 0% | 2026-05-11 |
| SWE-bench Pro | Software Engineering | 12 | 9.67 | 2026-05-06 |