McEval
McEval: Measures model capability on programming, code generation, code repair, or repository-level software tasks.
Rows: 22
Primary metric: AVG
Sampled: 2026-05-27
Metrics: AVG, AWK, C, C++, C#, Clisp, Coffee, Dart, Elisp, Elixir, Erlang, Fortran, F#, Go, Groovy, Haskell, Html, Java, JS, Json, Julia, Kotlin, Lua, MD, Pascal, Perl, PHP, Power, Python, R, Racket, Ruby, Rust, Scala, Scheme, Shell, Swift, Tcl, TS, VB, VimL
| Rank | Subject | AVG | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4o-240513 | 65.2% | GPT-4o (`openai-gpt-4o`) | Imported | 2026-05-27 |
| 2 | GPT-4-Turbo-231106 | 63.4% | GPT-4 Turbo (`openai-gpt-4-turbo`) | Imported | 2026-05-27 |
| 3 | DeepSeek-Coder-33b-instruct | 54.3% | — | Imported | 2026-05-27 |
| 4 | GPT-3.5-Turbo-240125 | 52.6% | GPT-3.5 Turbo (`openai-gpt-3.5-turbo`) | Imported | 2026-05-27 |
| 5 | Codestral-22B-v0.1 | 50.5% | — | Imported | 2026-05-27 |
| 6 | Magicoder-S-DS-6.7B | 48.6% | — | Imported | 2026-05-27 |
| 7 | Yi-Large-Turbo | 46.6% | — | Imported | 2026-05-27 |
| 8 | DeepSeek-Coder-1.5-7b-instruct | 46% | — | Imported | 2026-05-27 |
| 9 | OpenCodeInterpreter-DS-6.7B | 46% | — | Imported | 2026-05-27 |
| 10 | CodeQwen-1.5-7b | 45.5% | — | Imported | 2026-05-27 |
| 11 | Nxcode-CQ-7B-orpo | 44.7% | — | Imported | 2026-05-27 |
| 12 | WizardCoder-Python-34B | 36.5% | — | Imported | 2026-05-27 |
| 13 | Llama-3-8B-Instruct | 36% | Llama 3 8B Instruct (`meta-llama-llama-3-8b-instruct`) | Imported | 2026-05-27 |
| 14 | Qwen1.5-72B-Chat | 35.8% | — | Imported | 2026-05-27 |
| 15 | Phi-3-medium-4k-instruct | 35.2% | — | Imported | 2026-05-27 |
| 16 | Codegemma-7b-it | 30.7% | — | Imported | 2026-05-27 |
| 17 | CodeLlama-34b-Instruct | 29.1% | — | Imported | 2026-05-27 |
| 18 | WizardCoder-15B-V1.0 | 28% | — | Imported | 2026-05-27 |
| 19 | CodeLlama-13b-Instruct | 27.7% | — | Imported | 2026-05-27 |
| 20 | CodeLlama-7b-Instruct | 24.6% | — | Imported | 2026-05-27 |
| 21 | OCTOCODER | 23.3% | — | Imported | 2026-05-27 |
| 22 | Codeshell-7b-chat | 23% | — | Imported | 2026-05-27 |
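
For readers who want to work with these results programmatically, here is a minimal sketch of reading the table into records and checking that the rank order agrees with the AVG column. The helper names (`parse_pct`, `parse_rows`, `is_rank_consistent`) and the `SAMPLE` string are hypothetical and introduced only for illustration. The sample rows are adapted from the table above. The sketch assumes the pipe-delimited layout shown here and that AVG cells are percentages with or without a decimal part (for example `65.2%` or `46%`).

```python
def parse_pct(cell: str) -> float:
    """Convert a cell such as '65.2%' or '46%' to a float."""
    return float(cell.strip().rstrip("%"))


def parse_rows(table: str) -> list[dict]:
    """Parse pipe-delimited leaderboard rows into dicts (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Header and separator lines do not start with a numeric rank.
        if not cells[0].isdigit():
            continue
        rows.append(
            {"rank": int(cells[0]), "subject": cells[1], "avg": parse_pct(cells[2])}
        )
    return rows


def is_rank_consistent(rows: list[dict]) -> bool:
    """True if ranks increase while AVG never increases (ties are allowed)."""
    return all(
        a["rank"] < b["rank"] and a["avg"] >= b["avg"]
        for a, b in zip(rows, rows[1:])
    )


# A few rows adapted from the leaderboard, used as sample input.
SAMPLE = """
| Rank | Subject | AVG | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4o-240513 | 65.2% | GPT-4o (`openai-gpt-4o`) | Imported | 2026-05-27 |
| 2 | GPT-4-Turbo-231106 | 63.4% | GPT-4 Turbo (`openai-gpt-4-turbo`) | Imported | 2026-05-27 |
| 4 | GPT-3.5-Turbo-240125 | 52.6% | GPT-3.5 Turbo (`openai-gpt-3.5-turbo`) | Imported | 2026-05-27 |
"""

rows = parse_rows(SAMPLE)
print(rows[0]["subject"], rows[0]["avg"])
print(is_rank_consistent(rows))
```

The consistency check uses `>=` rather than `>` so that tied scores, such as the two entries at 46%, do not count as an ordering error.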