Natural2Code
NaturalCodeBench (NCB) is a challenging code benchmark designed to mirror the complexity and variety of real-world coding tasks. It comprises 402 high-quality problems in Python and Java, selected from real user queries to online coding services and spanning 6 domains.
Rows: 8
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Normalized Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Gemini 2.0 Flash | 0.93 | Gemini 2.0 Flash (`google-gemini-2.0-flash`) | Self-reported | 2026-05-06 |
| 2 | Gemini 1.5 Pro | 0.85 | — | Self-reported | 2026-05-06 |
| 3 | Gemma 3 27B | 0.84 | Gemma 3 27B (`google-gemma-3-27b-it`) | Self-reported | 2026-05-06 |
| 4 | Gemma 3 12B | 0.81 | Gemma 3 12B (`google-gemma-3-12b-it`) | Self-reported | 2026-05-06 |
| 5 | Gemini 1.5 Flash | 0.80 | — | Self-reported | 2026-05-06 |
| 6 | Gemini 1.5 Flash 8B | 0.76 | — | Self-reported | 2026-05-06 |
| 7 | Gemma 3 4B | 0.70 | Gemma 3 4B (`google-gemma-3-4b-it`) | Self-reported | 2026-05-06 |
| 8 | Gemma 3 1B | 0.56 | — | Self-reported | 2026-05-06 |