MoonshotAI: Kimi K2 0905

Kimi / Moonshot AI

38 scores
38 benchmarks
$0.4 / $2 per 1M tokens (cost in/out)

Metadata

Kimi Closed/API

Aliases: kimi-k2-0905, moonshotai-kimi-k2-0905, moonshotai/kimi-k2-0905

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
MCP-Universe | Agentic | 21 | 19.91 | 2026-05-06
MCPMark | Agentic | 21 | 0.22 | 2026-05-06
Tau2 Airline | Agentic | 13 | 0.56 | 2026-05-06
Tau2-Bench Telecom | Agentic | 127 | 73.4% | 2026-05-11
Terminal-Bench Hard | Agentic | 131 | 23.5% | 2026-05-11
UAVBench | Agentic | 7 | 77.75 | 2026-05-06
VitaBench | Agentic | 24 | 11.50 | 2026-05-06
OpenUGI | Alignment | 70 | 52.15 | 2026-05-06
ALE-Bench | Coding | 81 | 267.13 | 2026-05-06
MultiPL-E | Coding | 4 | 0.857 | 2026-05-27
SciCode | Coding | 240 | 30.7% | 2026-05-11
TuRTLe Code Completion (Icarus Verilog) | Coding | 9 | 71.77 | 2026-05-06
TuRTLe Code Completion (Verilator) | Coding | 9 | 71.79 | 2026-05-06
TuRTLe Line Completion | Coding | 6 | 33.65 | 2026-05-06
TuRTLe Module Completion (NotSoTiny) | Coding | 6 | 19.00 | 2026-05-06
TuRTLe Spec-to-RTL (Icarus Verilog) | Coding | 9 | 68.72 | 2026-05-06
TuRTLe Spec-to-RTL (Verilator) | Coding | 9 | 67.80 | 2026-05-06
NeoEvalPlusN | Creative | 56 | 16 | 2026-05-06
Vectara HHEM Hallucination Leaderboard | Factuality | 91 | 82.10 | 2026-05-06
MageBench Season 1 | Game | 26 | 1558 rating / 5 games | 2026-05-28
MMLU-Redux | General Knowledge | 13 | 0.93 | 2026-05-06
Artificial Analysis Intelligence Index | Intelligence | 141 | 30.85 | 2026-05-11
Humanity's Last Exam | Intelligence | 233 | 6.3% | 2026-05-11
MMLU-Pro | Intelligence | 79 | 81.9% | 2026-05-11
AIME 2025 | Math | 123 | 57.3% | 2026-05-11
CNMO 2024 | Mathematics | 1 | 0.74 | 2026-05-06
HMMT 2025 | Mathematics | 28 | 0.39 | 2026-05-06
MATH-500 | Mathematics | 6 | 0.97 | 2026-05-06
PolyMath-en | Mathematics | 1 | 0.65 | 2026-05-06
Design Arena | Multimodal | 79 | 1156 | 2026-05-06
Artificial Analysis Openness Index | Openness | 174 | 27.78 | 2026-05-11
AutoLogi | Reasoning | 1 | 0.90 | 2026-05-06
GPQA Diamond | Reasoning | 139 | 76.7% | 2026-05-11
OJBench | Reasoning | 8 | 0.27 | 2026-05-06
EvasionBench | Safety | 5 | 66.68 | 2026-05-06
ChemBench | Science | 17 | 0.60 | 2026-05-06
CritPt | Science | 263 | 0% | 2026-05-11
ACEBench | Tool Use | 1 | 0.77 | 2026-05-06