MBPP+
MBPP+ code-generation leaderboard from EvalPlus.
25 rows
Primary metric: mbpp_plus
Sampled: 2026-05-05
Metrics: MBPP+ pass@1, MBPP pass@1
| Rank | Subject | MBPP+ pass@1 | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | O1 Preview (Sept 2024) | 80.20 | o1-preview (`openai-o1-preview`) | Imported | 2026-05-05 |
| 2 | O1 Mini (Sept 2024) | 78.80 | — | Imported | 2026-05-05 |
| 3 | Qwen2.5-Coder-32B-Instruct | 77.00 | Qwen2.5 Coder 32B Instruct (`qwen-qwen-2.5-coder-32b-instruct`) | Imported | 2026-05-05 |
| 4 | DeepSeek-Coder-V2-Instruct | 75.10 | — | Imported | 2026-05-05 |
| 5 | Gemini 1.5 Pro 002 | 74.60 | — | Imported | 2026-05-05 |
| 6 | Claude Sonnet 3.5 (June 2024) | 74.30 | — | Imported | 2026-05-05 |
| 7 | DeepSeek-V2.5 (Nov 2024) | 74.10 | — | Imported | 2026-05-05 |
| 8 | GPT-4-Turbo (Nov 2023) | 73.30 | GPT-4 Turbo (`openai-gpt-4-turbo`) | Imported | 2026-05-05 |
| 9 | claude-3-opus (Mar 2024) | 73.30 | — | Imported | 2026-05-05 |
| 10 | DeepSeek-V3 (Nov 2024) | 73.00 | DeepSeek V3 (`deepseek-deepseek-chat`) | Imported | 2026-05-05 |
| 11 | GPT 4o (Aug 2024) | 72.20 | GPT-4o (`openai-gpt-4o`) | Imported | 2026-05-05 |
| 12 | GPT 4o Mini (July 2024) | 72.20 | GPT-4o-mini (`openai-gpt-4o-mini`) | Imported | 2026-05-05 |
| 13 | OpenCoder-8B-Instruct | 71.40 | — | Imported | 2026-05-05 |
| 14 | DeepSeek-Coder-33B-instruct | 70.10 | — | Imported | 2026-05-05 |
| 15 | GPT-3.5-Turbo (Nov 2023) | 69.70 | GPT-3.5 Turbo (`openai-gpt-3.5-turbo`) | Imported | 2026-05-05 |
| 16 | Artigenz-Coder-DS-6.7B | 69.60 | — | Imported | 2026-05-05 |
| 17 | claude-3-sonnet (Mar 2024) | 69.30 | — | Imported | 2026-05-05 |
| 18 | CodeQwen1.5-7B-Chat | 69.00 | — | Imported | 2026-05-05 |
| 19 | Llama3-70B-instruct | 69.00 | Llama 3 70B Instruct (`meta-llama-llama-3-70b-instruct`) | Imported | 2026-05-05 |
| 20 | Magicoder-S-DS-6.7B | 69.00 | — | Imported | 2026-05-05 |
| 21 | claude-3-haiku (Mar 2024) | 68.80 | Claude 3 Haiku (`anthropic-claude-3-haiku`) | Imported | 2026-05-05 |
| 22 | OpenCodeInterpreter-DS-33B | 68.50 | — | Imported | 2026-05-05 |
| 23 | Gemini 1.5 Flash 002 | 67.50 | — | Imported | 2026-05-05 |
| 24 | WhiteRabbitNeo-33B-v1 | 66.90 | — | Imported | 2026-05-05 |
| 25 | OpenCodeInterpreter-DS-6.7B | 66.40 | — | Imported | 2026-05-05 |