MoonshotAI: Kimi K2 Thinking
Kimi / Moonshot AI
54 scores
54 benchmarks
Cost (in/out): $0.6 / $2.5 per 1M tokens
Metadata
Kimi Closed/API
Aliases: kimi-k2-thinking, kimi-k2-thinking-20251106, moonshotai-kimi-k2-thinking, moonshotai-kimi-k2-thinking-20251106, moonshotai/kimi-k2-thinking, moonshotai/kimi-k2-thinking-20251106
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| APEX-Agents | Agentic | 35 | 11.50 | 2026-05-06 |
| Clembench Text v3.0 | Agentic | 9 | 77.79 | 2026-05-06 |
| EnterpriseOps-Gym | Agentic | 19 | 19.2% | 2026-05-05 |
| Gert Labs Rankings | Agentic | 49 | 0.37 | 2026-05-11 |
| MultiChallenge | Agentic | 12 | 55.42 | 2026-05-06 |
| Poker Agent | Agentic | 14 | 1011.634% | 2025-12-23 |
| Tau2-Bench Telecom | Agentic | 42 | 93% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 92 | 31.1% | 2026-05-11 |
| VitaBench | Agentic | 13 | 12.80 | 2026-05-06 |
| OpenUGI | Alignment | 63 | 52.68 | 2026-05-06 |
| ALE-Bench | Coding | 62 | 597.50 | 2026-05-06 |
| Arena AI Code | Coding | 52 | 1329 | 2026-05-06 |
| LiveCodeBench | Coding | 73 | 63.145% | 2026-05-28 |
| Multi-SWE-Bench | Coding | 4 | 0.42 | 2026-05-06 |
| SciCode | Coding | 66 | 42.4% | 2026-05-11 |
| SWE-bench Verified | Coding | 43 | 60.2% | 2026-05-28 |
| Terminal-Bench 2.0 | Coding | 37 | 37.079% | 2026-05-28 |
| CorpFin v2 | Finance | 46 | 60.567% | 2026-05-28 |
| Finance Agent v1.1 | Finance | 42 | 36.647% | 2026-05-04 |
| PRBench Finance | Finance | 12 | 43.41 | 2026-05-06 |
| QuantSightBench | Finance | 8 | 0.6579 coverage | 2026-05-28 |
| TaxEval v2 | Finance | 50 | 71.709% | 2026-05-28 |
| MMLU-Redux | General Knowledge | 3 | 0.94 | 2026-05-06 |
| WeirdML | Generalization | 10 | 42.79 | 2026-05-06 |
| MedQA | Healthcare | 29 | 92.592% | 2026-04-16 |
| Omi SOAP Note Safety Benchmark | Healthcare | 4 | 4.55 | 2026-04-21 |
| Artificial Analysis Intelligence Index | Intelligence | 71 | 40.89 | 2026-05-11 |
| GPQA Diamond | Intelligence | 46 | 78.536% | 2026-05-28 |
| Humanity's Last Exam | Intelligence | 59 | 22.3% | 2026-05-11 |
| MMLU Pro | Intelligence | 56 | 81.068% | 2026-05-28 |
| MMLU-Pro | Intelligence | 36 | 84.8% | 2026-05-11 |
| CaseLaw v2 | Legal | 11 | 65.702% | 2026-05-04 |
| LegalBench | Legal | 57 | 80.201% | 2026-05-28 |
| Professional Reasoning Bench - Legal | Legal | 14 | 40.90 | 2026-05-06 |
| AIME | Math | 36 | 85.417% | 2026-04-16 |
| AIME 2025 | Math | 11 | 94.7% | 2026-05-11 |
| MGSM | Math | 42 | 90.146% | 2026-01-09 |
| HMMT 2025 | Mathematics | 4 | 0.97 | 2026-05-06 |
| IMO-AnswerBench | Mathematics | 12 | 0.79 | 2026-05-06 |
| OTIS Mock AIME 2024-2025 | Mathematics | 10 | 83.06 | 2026-05-06 |
| MEDIC Benchmark | Medical | 19 | 74.36 average normalized public table score | 2026-05-27 |
| Artificial Analysis Openness Index | Openness | 175 | 27.78 | 2026-05-11 |
| CAIS Text Capabilities Index | Reasoning | 25 | 18.1 | 2026-05-27 |
| GPQA Diamond | Reasoning | 69 | 83.8% | 2026-05-11 |
| OJBench | Reasoning | 2 | 0.49 | 2026-05-06 |
| SimpleBench | Reasoning | 18 | 26.30 | 2026-05-06 |
| CAIS Risk Index | Safety | 25 | 57.4 | 2026-05-27 |
| CritPt | Science | 55 | 2.6% | 2026-05-11 |
| GSO-Bench | Science | 5 | 4.90 | 2026-05-06 |
| BrowseComp-zh | Search | 8 | 0.62 | 2026-05-06 |
| FRAMES | Search | 1 | 0.87 | 2026-05-06 |
| Seal-0 | Search | 2 | 0.56 | 2026-05-06 |
| Lech Mazur Writing | Writing | 3 | 8.69 | 2026-05-06 |
| WritingBench | Writing | 15 | 0.74 | 2026-05-06 |