GeoCode Leaderboard
Leaderboard ranking large language models on geospatial code generation for Google Earth Engine tasks, scored with AutoGEEval and AutoGEEval++ pass@k metrics.
Rows: 28
Primary metric: autogeevalpp_pass1
Sampled: 2026-05-28
Metadata
Metrics
AutoGEEval Pass@1, AutoGEEval Pass@3, AutoGEEval Pass@5, AutoGEEval++ Pass@1, AutoGEEval++ Pass@3, AutoGEEval++ Pass@5
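For readers unfamiliar with pass@k, the sketch below shows the unbiased estimator commonly used for code-generation benchmarks: for a task with n sampled solutions of which c pass, pass@k = 1 - C(n-c, k) / C(n, k), averaged over tasks. This page does not state how AutoGEEval or AutoGEEval++ compute their scores, so treating them as using this estimator is an assumption. The function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n samples, c of which passed.

    Assumption: this is the standard unbiased estimator, not necessarily
    the exact procedure behind the leaderboard numbers.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (n_samples, n_correct)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical per-task outcomes, not data from this leaderboard.
    demo = [(5, 5), (5, 1), (5, 0)]
    for k in (1, 3, 5):
        print(f"pass@{k}: {mean_pass_at_k(demo, k):.2%}")
```

With a single sample per task (n = k = 1), the estimate reduces to the plain fraction of tasks solved on the first attempt, which is how a Pass@1 column is usually read.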
| Rank | Subject | AutoGEEval++ Pass@1 | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Gemini2.5-Flash-0520 | 71.87% pass@1 | — | Imported | 2026-05-28 |
| 2 | DeepSeek-V3-1226 | 71.67% pass@1 | — | Imported | 2026-05-28 |
| 3 | GPT-4.1 | 70.93% pass@1 | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-28 |
| 4 | Claude3.7-Sonnet | 70.35% pass@1 | Claude 3.7 Sonnet anthropic-claude-3.7-sonnet | Imported | 2026-05-28 |
| 5 | DeepSeek-V3-0324 | 70.25% pass@1 | DeepSeek V3 0324 deepseek-deepseek-chat-v3-0324 | Imported | 2026-05-28 |
| 6 | o4-mini-medium | 67.34% pass@1 | — | Imported | 2026-05-28 |
| 7 | Qwen2.5-Coder-32B | 66.77% pass@1 | — | Imported | 2026-05-28 |
| 8 | GPT-4.1-mini | 66.56% pass@1 | GPT-4.1 Mini openai-gpt-4.1-mini | Imported | 2026-05-28 |
| 9 | Qwen-2.5-32B | 65.45% pass@1 | — | Imported | 2026-05-28 |
| 10 | Gemini-2.0-pro | 65.36% pass@1 | — | Imported | 2026-05-28 |
| 11 | QwQ-32B | 60.69% pass@1 | — | Imported | 2026-05-28 |
| 12 | Qwen-3-32B | 60.67% pass@1 | Qwen3 32B qwen-qwen3-32b | Imported | 2026-05-28 |
| 13 | GeoCode-GPT-7B | 60.48% pass@1 | — | Imported | 2026-05-28 |
| 14 | DeepSeek-R1-0120 | 60.42% pass@1 | — | Imported | 2026-05-28 |
| 15 | GPT-4o | 59.02% pass@1 | GPT-4o openai-gpt-4o | Imported | 2026-05-28 |
| 16 | o3-mini-medium | 56.98% pass@1 | — | Imported | 2026-05-28 |
| 17 | Qwen-3-32B-Thinking | 56.31% pass@1 | — | Imported | 2026-05-28 |
| 18 | GPT-4o-mini | 55.02% pass@1 | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-28 |
| 19 | Qwen2.5-Coder-7B | 50.51% pass@1 | — | Imported | 2026-05-28 |
| 20 | Qwen-2.5-7B | 49.58% pass@1 | — | Imported | 2026-05-28 |
| 21 | Qwen-3-8B | 48.35% pass@1 | Qwen3 8B qwen-qwen3-8b | Imported | 2026-05-28 |
| 22 | Qwen2.5-Coder-3B | 45.05% pass@1 | — | Imported | 2026-05-28 |
| 23 | Code-Llama-7B | 44.80% pass@1 | — | Imported | 2026-05-28 |
| 24 | Qwen-3-8B-Thinking | 38.93% pass@1 | — | Imported | 2026-05-28 |
| 25 | DeepSeek-Coder-V2-16B | 38.80% pass@1 | — | Imported | 2026-05-28 |
| 26 | Qwen-2.5-3B | 36.26% pass@1 | — | Imported | 2026-05-28 |
| 27 | Qwen-3-4B | 34.03% pass@1 | — | Imported | 2026-05-28 |
| 28 | Qwen-3-4B-Thinking | 33.13% pass@1 | — | Imported | 2026-05-28 |