CHOICE
Remote-sensing vision-language benchmark for perception and reasoning over Earth-observation imagery across multiple task dimensions.
24 rows
Primary metric: overall_score
Sampled: 2026-05-27
Metadata
Metrics: Overall Score, Perception Score, Reasoning Score, ILC, SII, CID, AttR, AssR, CSR
| Rank | Subject | Overall Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Qwen2-VL-72B | 0.7378 | — | Imported | 2026-05-27 |
| 2 | InternVL2-40B | 0.7263 | — | Imported | 2026-05-27 |
| 3 | Ovis1.6-Gemma2-9B | 0.6998 | — | Imported | 2026-05-27 |
| 4 | Gemini-1.5-Pro | 0.6942 | — | Imported | 2026-05-27 |
| 5 | Qwen2-VL-7B | 0.6922 | — | Imported | 2026-05-27 |
| 6 | InternVL2-26B | 0.6858 | — | Imported | 2026-05-27 |
| 7 | InternVL2-8B | 0.6772 | — | Imported | 2026-05-27 |
| 8 | GLM-4V-9B | 0.6435 | — | Imported | 2026-05-27 |
| 9 | GPT-4o-2024-11-20 | 0.6275 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-27 |
| 10 | Molmo-7B-D | 0.6215 | — | Imported | 2026-05-27 |
| 11 | MiniCPM-V-2.5 | 0.6163 | — | Imported | 2026-05-27 |
| 12 | GPT-4o-mini | 0.6133 | GPT-4o-mini (openai-gpt-4o-mini) | Imported | 2026-05-27 |
| 13 | DeepSeek-VL-7B | 0.6102 | — | Imported | 2026-05-27 |
| 14 | LLaVA-1.6-13B | 0.5907 | — | Imported | 2026-05-27 |
| 15 | Llama3.2-11B | 0.5672 | — | Imported | 2026-05-27 |
| 16 | mPLUG-Owl3-7B | 0.5648 | — | Imported | 2026-05-27 |
| 17 | VHM | 0.5623 | — | Imported | 2026-05-27 |
| 18 | LLaVA-1.6-7B | 0.5592 | — | Imported | 2026-05-27 |
| 19 | Phi3-Vision | 0.5293 | — | Imported | 2026-05-27 |
| 20 | GeoChat | 0.5067 | — | Imported | 2026-05-27 |
| 21 | LHRS-Bot-nova | 0.4930 | — | Imported | 2026-05-27 |
| 22 | RemoteCLIP | 0.4868 | — | Imported | 2026-05-27 |
| 23 | GeoRSCLIP | 0.4532 | — | Imported | 2026-05-27 |
| 24 | LHRS-Bot | 0.4160 | — | Imported | 2026-05-27 |