MoonshotAI: Kimi K2.6

Kimi / Moonshot AI

122 scores
104 benchmarks
$0.74 / $3.49 per 1M tokens (cost in/out)

Metadata

Kimi Closed/API

Aliases: kimi-k2.6, kimi-k2.6-20260420, moonshotai-kimi-k2.6, moonshotai-kimi-k2.6-20260420, moonshotai/kimi-k2.6, moonshotai/kimi-k2.6-20260420, K2.6 Thinking, Kimi K2.6 Thinking, kimi-k2.6-thinking, moonshotai/kimi-k2.6-thinking

Benchmark Results

Benchmark | Category | Rank | Score | Sampled
CoWorkBench | Agentic | 6 | 58.2% | 2026-05-28
Gert Labs Rankings | Agentic | 5 | 0.66 | 2026-05-11
HiL-Bench | Agentic | 6 | 14.67% | 2026-05-05
ITBench-AA | Agentic | 16 | 31.2% | 2026-05-28
MCP Atlas | Agentic | 6 | 66.6% | 2026-05-28
MCPMark | Agentic | 5 | 55.9% | 2026-05-28
OSWorld | Agentic | 8 | 73.06% | 2026-05-27
OSWorld-Verified | Agentic | 5 | 0.73 | 2026-05-06
QwenClawBench | Agentic | 6 | 54.7% | 2026-05-28
QwenWorldBench | Agentic | 4 | 50.9% | 2026-05-28
Tau2-Bench Telecom | Agentic | 14 | 95.9% | 2026-05-11
Tau2-Bench Telecom | Agentic | 32 | 93.9% | 2026-05-11
Terminal-Bench Hard | Agentic | 22 | 43.9% | 2026-05-11
Terminal-Bench Hard | Agentic | 49 | 37.9% | 2026-05-11
TERMS-Bench | Agentic | 10 | 59.7% SE+ | 2026-05-28
Toolathlon | Agentic | 4 | 0.50 | 2026-05-06
Vending-Bench 2 | Agentic | 5 | 6204.57 | 2026-05-28
VitaBench | Agentic | 5 | 39.1% | 2026-05-28
YC-Bench | Agentic | 4 | 511137 | 2026-05-06
OpenUGI | Alignment | 90 | 51.08 | 2026-05-06
OpenUGI | Alignment | 529 | 36.07 | 2026-05-06
ALE-Bench | Coding | 22 | 1092.67 | 2026-05-06
Arena AI Code | Coding | 7 | 1525 | 2026-05-06
BLXBench | Coding | 20 | 15.40 | 2026-05-06
Claw-Eval | Coding | 4 | 61.5% | 2026-05-28
Claw-Eval | Coding | 1 | 0.81 | 2026-05-06
DeepSWE | Coding | 8 | 23.89 | 2026-05-26
Kernel Bench L3 | Coding | 4 | 1.41/80% | 2026-05-28
LiveCodeBench | Coding | 3 | 89.6% | 2026-05-28
LiveCodeBench | Coding | 8 | 86.771% | 2026-05-28
LMArena WebDev Arena | Coding | 7 | 1524.58 | 2026-05-06
NL2Repo | Coding | 3 | 42.8% | 2026-05-28
QwenSVG | Coding | 6 | 1325 | 2026-05-28
SciCode | Coding | 2 | 52.2% | 2026-05-28
SciCode | Coding | 9 | 53.5% | 2026-05-11
SciCode | Coding | 109 | 39.5% | 2026-05-11
SkillsBench | Coding | 2 | 56.2% | 2026-05-28
SWE-bench Verified | Coding | 14 | 76.2% | 2026-05-28
Terminal-Bench 2.0 | Coding | 13 | 57.303% | 2026-05-28
Terminal-Bench 2.0 | Coding | 3 | 66.7% | 2026-05-28
Terminal-Bench 2.1 | Coding | 9 | 53.558% | 2026-05-28
Vibe Code Bench v1.1 | Coding | 14 | 37.891% | 2026-05-28
ExploitBench v8-bench | Cybersecurity | 12 | 2.63 points | 2026-05-15
ExploitBench v8-bench | Cybersecurity | 14 | 2.44 points | 2026-05-15
Arena AI Document | Document AI | 10 | 1457 | 2026-05-06
SAGE | Education | 9 | 50.224% | 2026-05-28
AA-Omniscience | Factuality | 8 | 6.42 | 2026-05-11
CorpFin v2 | Finance | 7 | 66.744% | 2026-05-28
Finance Agent v1.1 | Finance | 12 | 57.056% | 2026-05-04
Finance Agent v2 | Finance | 8 | 44.866% | 2026-05-28
MortgageTax | Finance | 25 | 65.818% | 2026-05-28
Rogo Big Finance Bench | Finance | 6 | 45% rubric / 27% final | 2026-05-28
TaxEval v2 | Finance | 17 | 74.652% | 2026-05-28
InfiniteBM Heads-Up No-Limit Hold'em | Game | 28 | 1013.57 Elo / 116 games | 2026-05-28
InfiniteBM Liar's Dice | Game | 30 | 1036.29 Elo / 1715 games | 2026-05-28
BenchLM | General Knowledge | 12 | 85 | 2026-05-06
MAXIFE | General Knowledge | 5 | 87.7% | 2026-05-28
MMLU-ProX | General Knowledge | 6 | 83.7% | 2026-05-28
MMLU-Redux | General Knowledge | 1 | 95.3% | 2026-05-28
NOVA-63 | General Knowledge | 4 | 56.7% | 2026-05-28
LMArena Text Arena | Generalization | 18 | 1454.64 | 2026-05-06
MedCode | Healthcare | 31 | 40.142% | 2026-05-28
MedScribe | Healthcare | 26 | 78.149% | 2026-05-28
PhysicianBench | Healthcare | 7 | 17.0 +/- 2.6 | 2026-05-27
IFBench | Instruction Following | 4 | 76% | 2026-05-28
IFEval | Instruction Following | 2 | 94.5% | 2026-05-28
AIIQ Composite IQ | Intelligence | 11 | 122 | 2026-05-12
Artificial Analysis Intelligence Index | Intelligence | 7 | 53.9 | 2026-05-11
Artificial Analysis Intelligence Index | Intelligence | 55 | 42.95 | 2026-05-11
GPQA Diamond | Intelligence | 14 | 89.142% | 2026-05-28
HLE w/ tools | Intelligence | 1 | 54% | 2026-05-28
Humanity's Last Exam | Intelligence | 4 | 36.4% | 2026-05-28
Humanity's Last Exam | Intelligence | 12 | 35.9% | 2026-05-11
Humanity's Last Exam | Intelligence | 81 | 18.2% | 2026-05-11
LiveBench | Intelligence | 23 | 72.39 | 2026-05-05
MathVision | Intelligence | 3 | 93.20 | 2026-05-06
MathVision | Intelligence | 7 | 87.40 | 2026-05-06
MMLU Pro | Intelligence | 12 | 87.572% | 2026-05-28
MMLU-Pro | Intelligence | 5 | 87.1% | 2026-05-28
MMMU Pro | Intelligence | 10 | 86.301% | 2026-05-28
SuperGPQA | Intelligence | 4 | 71.3% | 2026-05-28
Vals Index | Intelligence | 8 | 55.551% | 2026-05-28
Vals Multimodal Index | Intelligence | 6 | 56.788% | 2026-05-28
CaseLaw v2 | Legal | 22 | 61.201% | 2026-05-04
LegalBench | Legal | 12 | 84.738% | 2026-05-28
MRCR-v2 128k | Long Context | 5 | 63.1% | 2026-05-28
ProofBench | Math | 18 | 16% | 2026-05-28
HMMT February 2026 | Mathematics | 4 | 92.7% | 2026-05-28
IMO-AnswerBench | Mathematics | 3 | 86% | 2026-05-28
IMO-AnswerBench | Mathematics | 3 | 0.86 | 2026-05-06
MathArena Apex | Mathematics | 4 | 24% | 2026-05-28
INCLUDE | Multilingual | 6 | 84.2% | 2026-05-28
MMMLU | Multilingual | 5 | 87.5% | 2026-05-28
BabyVision | Multimodal | 1 | 0.69 | 2026-05-06
Blueprint-Bench 2 | Multimodal | 9 | 0.557 +/- 0.015 | 2026-05-28
CharXiv-R | Multimodal | 3 | 0.87 | 2026-05-06
Design Arena | Multimodal | 4 | 1342 | 2026-05-06
LMArena Vision Arena | Multimodal | 11 | 1278.42 | 2026-05-06
Artificial Analysis Openness Index | Openness | 163 | 33.33 | 2026-05-11
Altered Riddles | Reasoning | 14 | 0.4413 | 2026-05-27
CAIS Text Capabilities Index | Reasoning | 14 | 31.4 | 2026-05-27
Context Arena | Reasoning | 13 | 64.63 | 2026-05-06
Context Arena | Reasoning | 21 | 51.88 | 2026-05-06
Global PIQA | Reasoning | 6 | 89.2% | 2026-05-28
GPQA Diamond | Reasoning | 3 | 90.5% | 2026-05-28
GPQA Diamond | Reasoning | 9 | 91.1% | 2026-05-11
GPQA Diamond | Reasoning | 114 | 78.8% | 2026-05-11
OJBench | Reasoning | 1 | 0.61 | 2026-05-06
CAIS Risk Index | Safety | 35 | 63.0 | 2026-05-27
CritPt | Science | 4 | 8% | 2026-05-28
CritPt | Science | 23 | 8% | 2026-05-11
CritPt | Science | 75 | 1.4% | 2026-05-11
DeepSearchQA | Search | 3 | 0.83 | 2026-05-06
WideSearch | Search | 1 | 0.81 | 2026-05-06
SWE-bench Multilingual | Software Engineering | 3 | 76.7% | 2026-05-28
SWE-bench Pro | Software Engineering | 2 | 59.5% | 2026-05-28
SWE-bench Verified | Software Engineering | 4 | 80.2% | 2026-05-28
SpreadsheetBench | Spreadsheets | 5 | 84.5% | 2026-05-28
LiveSQLBench | Text to SQL | 5 | 36.43 | 2026-05-06
BFCL-V4 | Tool Use | 3 | 71.3% | 2026-05-28
WMT24++ | Translation | 6 | 81.6% | 2026-05-28
CAIS Vision Capabilities Index | Vision | 4 | 61.2 | 2026-05-27