MoonshotAI: Kimi K2 0711

Kimi / Moonshot AI

53 scores
53 benchmarks
$0.57 / $2.3 per 1M tokens (cost in/out)

Metadata

Kimi Closed/API

Aliases: kimi-k2, moonshotai-kimi-k2, moonshotai/kimi-k2

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
| --- | --- | --- | --- | --- |
| ADBench | Agentic | 7 | 79 | 2026-05-06 |
| Berkeley Function-Calling Leaderboard | Agentic | 11 | 59.06% | 2026-05-27 |
| Galileo Agent Leaderboard | Agentic | 5 | 0.53 | 2026-05-06 |
| LLM-WikiRace | Agentic | 11 | 45.30 | 2026-05-06 |
| Tau2 Airline | Agentic | 13 | 0.56 | 2026-05-06 |
| Tau2-Bench Telecom | Agentic | 159 | 61.1% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 173 | 15.9% | 2026-05-11 |
| OpenUGI | Alignment | 30 | 56.55 | 2026-05-06 |
| IOI | Coding | 51 | 1.25% | 2026-05-26 |
| LiveCodeBench | Coding | 61 | 70.449% | 2026-05-28 |
| MultiPL-E | Coding | 4 | 0.857 | 2026-05-27 |
| SciCode | Coding | 200 | 34.5% | 2026-05-11 |
| Terminal-Bench 2.0 | Coding | 48 | 25.843% | 2026-05-28 |
| NeoEvalPlusN | Creative | 64 | 15.50 | 2026-05-06 |
| kluster.ai LLM Hallucination Detection Leaderboard | Factuality | 8 | 97.03 | 2026-05-06 |
| CorpFin v2 | Finance | 84 | 50.388% | 2026-05-28 |
| FinanceArena | Finance | 15 | 33.8 | 2026-05-27 |
| PRBench Finance | Finance | 16 | 38.34 | 2026-05-06 |
| TaxEval v2 | Finance | 65 | 70.196% | 2026-05-28 |
| BenchLM | General Knowledge | 75 | 42 | 2026-05-06 |
| CSimpleQA | General Knowledge | 5 | 0.78 | 2026-05-06 |
| MMLU-Redux | General Knowledge | 13 | 0.93 | 2026-05-06 |
| HELM AIR-Bench | Generalization | 31 | 0.741131 | 2026-05-28 |
| MedQA | Healthcare | 62 | 83.975% | 2026-04-16 |
| HUMAINE | Human Preference | 6 | 3.71 | 2026-05-06 |
| AIIQ Composite IQ | Intelligence | 31 | 101 | 2026-05-12 |
| Artificial Analysis Intelligence Index | Intelligence | 179 | 26.32 | 2026-05-11 |
| GPQA Diamond | Intelligence | 65 | 71.464% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 213 | 7% | 2026-05-11 |
| MMLU Pro | Intelligence | 69 | 79.394% | 2026-05-28 |
| MMLU-Pro | Intelligence | 68 | 82.4% | 2026-05-11 |
| LegalBench | Legal | 49 | 81.454% | 2026-05-28 |
| Professional Reasoning Bench - Legal | Legal | 23 | 36.38 | 2026-05-06 |
| Fiction.LiveBench | Long Context | 16 | 40.60 | 2026-05-06 |
| AIME | Math | 56 | 62.708% | 2026-04-16 |
| AIME 2025 | Math | 124 | 57% | 2026-05-11 |
| IneqMath | Math | 20 | 9 | 2026-05-06 |
| MATH 500 | Math | 13 | 94.2% | 2026-01-09 |
| MGSM | Math | 31 | 90.946% | 2026-01-09 |
| CNMO 2024 | Mathematics | 1 | 0.74 | 2026-05-06 |
| HMMT 2025 | Mathematics | 28 | 0.39 | 2026-05-06 |
| MATH-500 | Mathematics | 6 | 0.97 | 2026-05-06 |
| PolyMath-en | Mathematics | 1 | 0.65 | 2026-05-06 |
| LiveMedBench | Medical | 29 | 0.0585 | 2026-05-27 |
| Artificial Analysis Openness Index | Openness | 89 | 44.44 | 2026-05-11 |
| AutoLogi | Reasoning | 1 | 0.90 | 2026-05-06 |
| GPQA Diamond | Reasoning | 141 | 76.6% | 2026-05-11 |
| Humanity's Last Exam (Text Only) | Reasoning | 45 | 4.68 | 2026-05-06 |
| MultiNRC | Reasoning | 31 | 18.48 | 2026-05-06 |
| OJBench | Reasoning | 8 | 0.27 | 2026-05-06 |
| CritPt | Science | 262 | 0% | 2026-05-11 |
| SWE-bench Pro | Software Engineering | 6 | 27.67 | 2026-05-06 |
| ACEBench | Tool Use | 1 | 0.77 | 2026-05-06 |