R1 0528
DeepSeek / DeepSeek
32 scores
31 benchmarks
$0.5 / $2.15 per 1M tokens (cost in/out)
Metadata
DeepSeek Open source
Aliases: deepseek-deepseek-r1-0528, deepseek-r1-0528, deepseek/deepseek-r1-0528
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| ARC-AGI-1 | Agentic | 109 | 21.21 | 2026-05-05 |
| ARC-AGI-2 | Agentic | 115 | 1.12 | 2026-05-05 |
| AgentBench FC | Agents | 14 | 49.30 | 2026-05-06 |
| ALE-Bench | Coding | 39 | 804.13 | 2026-05-06 |
| Codeforces | Coding | 14 | 0.6433 | 2026-05-28 |
| LiveCodeBench | Coding | 5 | 73.10 | 2026-05-06 |
| TuRTLe Code Completion (Icarus Verilog) | Coding | 4 | 78.86 | 2026-05-06 |
| TuRTLe Code Completion (Verilator) | Coding | 4 | 78.08 | 2026-05-06 |
| TuRTLe Module Completion (NotSoTiny) | Coding | 5 | 20.73 | 2026-05-06 |
| TuRTLe Spec-to-RTL (Icarus Verilog) | Coding | 4 | 76.79 | 2026-05-06 |
| TuRTLe Spec-to-RTL (Verilator) | Coding | 4 | 75.83 | 2026-05-06 |
| SecCodeBench | Cybersecurity | 18 | 54.06% | 2026-05-28 |
| MMTU | Data | 8 | 0.58 | 2026-05-06 |
| GSMA Open Telco Leaderboard | Domain | 62 | 43.37 | 2026-05-06 |
| kluster.ai LLM Hallucination Detection Leaderboard | Factuality | 3 | 98.48 | 2026-05-06 |
| FinanceArena | Finance | 10 | 42.9 | 2026-05-27 |
| PRBench Finance | Finance | 26 | 32.67 | 2026-05-06 |
| MMLU-Redux | General Knowledge | 8 | 0.93 | 2026-05-06 |
| HELM Safety | Generalization | 39 | 0.894417 | 2026-05-28 |
| LongBench v2 | Generalization | 7 | 56.7% | 2026-05-27 |
| HUMAINE | Human Preference | 1 | 3.79 | 2026-05-06 |
| Professional Reasoning Bench - Legal | Legal | 22 | 36.61 | 2026-05-06 |
| AIME 2024 | Math | 3 | 91.4 | 2026-05-27 |
| IneqMath | Math | 18 | 9.50 | 2026-05-06 |
| IneqMath | Math | 32 | 4.50 | 2026-05-06 |
| HMMT 2025 | Mathematics | 24 | 0.79 | 2026-05-06 |
| LanguageBench | Multilingual | 31 | 0.12 | 2026-05-06 |
| Humanity's Last Exam (Text Only) | Reasoning | 20 | 14.04 | 2026-05-06 |
| MultiNRC | Reasoning | 21 | 27.58 | 2026-05-06 |
| LiveSecBench | Safety | 20 | 55.22 | 2026-05-27 |
| BrowseComp-zh | Search | 13 | 0.36 | 2026-05-06 |
| IDE-Bench | Software Engineering | 11 | 20 | 2026-05-27 |