ClassEval
Class-level code generation benchmark that reports class-level and function-level success under three generation strategies: holistic (H), incremental (I), and compositional (C).
Rows: 33
Primary metric: pass_1_class_success
Sampled: 2026-05-27
Metadata
Metrics: Pass@1 Class Success, Pass@1 Class Partial Success, Pass@1 Function Success, Pass@5 Class Success, Pass@5 Function Success
| Rank | Subject | Pass@1 Class Success (%) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4 (H) | 37.6 | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 2 | GPT-3.5 (H) | 29.6 | — | Imported | 2026-05-27 |
| 3 | GPT-4 (C) | 29.6 | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 4 | GPT-4 (I) | 26.2 | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 5 | GPT-3.5 (I) | 25.6 | — | Imported | 2026-05-27 |
| 6 | GPT-3.5 (C) | 18.2 | — | Imported | 2026-05-27 |
| 7 | WizardCoder (C) | 12.2 | — | Imported | 2026-05-27 |
| 8 | Instruct-StarCoder (H) | 10.2 | — | Imported | 2026-05-27 |
| 9 | WizardCoder (H) | 9.2 | — | Imported | 2026-05-27 |
| 10 | Instruct-StarCoder (C) | 9.0 | — | Imported | 2026-05-27 |
| 11 | SantaCoder (I) | 8.6 | — | Imported | 2026-05-27 |
| 12 | Instruct-StarCoder (I) | 8.4 | — | Imported | 2026-05-27 |
| 13 | Instruct-CodeGen (I) | 8.2 | — | Imported | 2026-05-27 |
| 14 | Instruct-CodeGen (H) | 7.4 | — | Imported | 2026-05-27 |
| 15 | CodeGeeX (I) | 7.2 | — | Imported | 2026-05-27 |
| 16 | Incoder (I) | 6.2 | — | Imported | 2026-05-27 |
| 17 | Instruct-CodeGen (C) | 5.8 | — | Imported | 2026-05-27 |
| 18 | WizardCoder (I) | 5.4 | — | Imported | 2026-05-27 |
| 19 | CodeGeeX (C) | 3.8 | — | Imported | 2026-05-27 |
| 20 | Incoder (C) | 3.4 | — | Imported | 2026-05-27 |
| 21 | SantaCoder (C) | 3.2 | — | Imported | 2026-05-27 |
| 22 | Vicuna (I) | 3.0 | — | Imported | 2026-05-27 |
| 23 | Incoder (H) | 2.6 | — | Imported | 2026-05-27 |
| 24 | PolyCoder (C) | 2.6 | — | Imported | 2026-05-27 |
| 25 | Vicuna (C) | 2.6 | — | Imported | 2026-05-27 |
| 26 | ChatGLM (C) | 1.4 | — | Imported | 2026-05-27 |
| 27 | PolyCoder (I) | 1.4 | — | Imported | 2026-05-27 |
| 28 | Vicuna (H) | 1.4 | — | Imported | 2026-05-27 |
| 29 | ChatGLM (I) | 1.2 | — | Imported | 2026-05-27 |
| 30 | ChatGLM (H) | 1.0 | — | Imported | 2026-05-27 |
| 31 | CodeGeeX (H) | 1.0 | — | Imported | 2026-05-27 |
| 32 | SantaCoder (H) | 1.0 | — | Imported | 2026-05-27 |
| 33 | PolyCoder (H) | 0.0 | — | Imported | 2026-05-27 |
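
The Pass@1 and Pass@5 metrics listed above are instances of pass@k. This page does not say how the imported scores were computed, so the following is only a minimal sketch that assumes the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), averaged over tasks. The function names `pass_at_k` and `benchmark_score` and the `(n, c)` input layout are hypothetical, chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n generated samples, c of which pass.

    Assumption: the standard unbiased estimator is used; the leaderboard
    itself does not document its scoring procedure.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no passing sample at all.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int) -> float:
    """Mean pass@k over tasks, as a percentage.

    `results` holds one hypothetical (n, c) pair per task.
    """
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Under this sketch, a class-level score would count a sample as passing only if the whole generated class passes its tests, while a function-level score would count each function separately; that reading is an assumption based on the metric names.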