ENAMEL
ENAMEL is an efficiency-aware code-generation benchmark built from HumanEval problems. It pairs each problem with an expert-written efficient reference solution and a strong test-case generator, and it reports eff@1 alongside pass@1.
Rows: 32
Primary metric: eff_at_1
Sampled: 2026-05-06
Metadata
Metrics: eff@1, pass@1
| Rank | Subject | eff@1 | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | HumanEval+ | 0.52 | — | Imported | 2026-05-06 |
| 2 | GPT-4 Turbo (Nov 2023) | 0.47 | GPT-4 Turbo (`openai-gpt-4-turbo`) | Imported | 2026-05-06 |
| 3 | HumanEval | 0.46 | — | Imported | 2026-05-06 |
| 4 | GPT-4 (Jun 2023) | 0.45 | GPT-4 (`openai-gpt-4`) | Imported | 2026-05-06 |
| 5 | Llama 3 70B Instruct | 0.42 | Llama 3 70B Instruct (`meta-llama-llama-3-70b-instruct`) | Imported | 2026-05-06 |
| 6 | Mixtral 8x22B Instruct | 0.41 | Mistral: Mixtral 8x22B Instruct (`mistralai-mixtral-8x22b-instruct`) | Imported | 2026-05-06 |
| 7 | Claude 3 Opus | 0.40 | — | Imported | 2026-05-06 |
| 8 | Phind Code Llama V2 | 0.39 | — | Imported | 2026-05-06 |
| 9 | Claude 3 Haiku | 0.39 | Claude 3 Haiku (`anthropic-claude-3-haiku`) | Imported | 2026-05-06 |
| 10 | ChatGPT | 0.36 | — | Imported | 2026-05-06 |
| 11 | Claude 3 Sonnet | 0.34 | — | Imported | 2026-05-06 |
| 12 | Llama 3 8B Instruct | 0.34 | Llama 3 8B Instruct (`meta-llama-llama-3-8b-instruct`) | Imported | 2026-05-06 |
| 13 | Code Llama 34B Python | 0.27 | — | Imported | 2026-05-06 |
| 14 | Mixtral 8x7B Instruct | 0.27 | Mistral: Mixtral 8x7B Instruct (`mistralai-mixtral-8x7b-instruct`) | Imported | 2026-05-06 |
| 15 | Code Llama 70B Python | 0.26 | — | Imported | 2026-05-06 |
| 16 | Code Llama 7B Python | 0.25 | — | Imported | 2026-05-06 |
| 17 | Code Llama 13B Python | 0.22 | — | Imported | 2026-05-06 |
| 18 | StarCoder | 0.20 | — | Imported | 2026-05-06 |
| 19 | CodeGen 6B | 0.19 | — | Imported | 2026-05-06 |
| 20 | CodeGen 16B | 0.17 | — | Imported | 2026-05-06 |
| 21 | CodeT5+ 16B | 0.16 | — | Imported | 2026-05-06 |
| 22 | CodeGen 2B | 0.15 | — | Imported | 2026-05-06 |
| 23 | Mistral 7B | 0.15 | — | Imported | 2026-05-06 |
| 24 | Vicuna 13B | 0.12 | — | Imported | 2026-05-06 |
| 25 | SantaCoder | 0.10 | — | Imported | 2026-05-06 |
| 26 | Incoder 6B | 0.09 | — | Imported | 2026-05-06 |
| 27 | GPT-J | 0.08 | — | Imported | 2026-05-06 |
| 28 | Incoder 1B | 0.07 | — | Imported | 2026-05-06 |
| 29 | Vicuna 7B | 0.06 | — | Imported | 2026-05-06 |
| 30 | GPT-Neo 2B | 0.04 | — | Imported | 2026-05-06 |
| 31 | PolyCoder | 0.04 | — | Imported | 2026-05-06 |
| 32 | StableLM 7B | 0.02 | — | Imported | 2026-05-06 |
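
The two metrics named above can be made concrete with a small sketch. It assumes that each problem has n sampled solutions, that each sample carries an efficiency score in [0, 1] with 0 meaning incorrect, and that eff@k is the expected best score among k samples drawn without replacement. That last point is an assumption about the benchmark's definition, not something this page states. The function names `pass_at_k` and `eff_at_k` are hypothetical and are not taken from the benchmark's codebase.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples, drawn without
    replacement from n samples of which c are correct, is correct."""
    if not 1 <= k <= n or not 0 <= c <= n:
        raise ValueError("require 1 <= k <= n and 0 <= c <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def eff_at_k(scores: list[float], k: int) -> float:
    """Expected maximum efficiency score over a uniformly random
    k-subset of the given per-sample scores (assumed definition)."""
    n = len(scores)
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    ordered = sorted(scores)
    # The r-th smallest score is the subset maximum in C(r-1, k-1) subsets.
    weighted = sum(comb(r - 1, k - 1) * ordered[r - 1] for r in range(k, n + 1))
    return weighted / comb(n, k)


# Hypothetical scores for four samples of one problem (0.0 = incorrect).
example_scores = [0.0, 0.5, 1.0, 0.5]
print(pass_at_k(n=4, c=3, k=1))     # fraction of correct samples
print(eff_at_k(example_scores, 1))  # mean efficiency score
```

Under these assumptions, k = 1 reduces pass@1 to the fraction of correct samples and eff@1 to the mean efficiency score. Since an incorrect sample scores 0 and no sample scores above 1, eff@1 cannot exceed pass@1 for the same set of samples.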