OmniEarth-Bench

Holistic multimodal Earth-science benchmark across atmosphere, oceans, cryosphere, biosphere, land, and human-earth interaction tasks.

9 rows
Primary metric: vqa_average
Sampled: 2026-05-27

Metadata

Metrics

VQA Average, Cross-sphere, Atmosphere, Lithosphere, Oceansphere, Cryosphere, Biosphere, Human-activities

Latest Results

Rows are parsed from the OmniEarth-Bench public VQA results table. Scores are percentages.

Rank | Subject | VQA Average (%) | Model Match | Provenance | Sampled
1 | InternVL3-72B | 33.26 | | Imported | 2026-05-27
2 | InternVL3-7B | 33.13 | | Imported | 2026-05-27
3 | LLaVA-Onevision-7B | 31.51 | | Imported | 2026-05-27
4 | Claude-3.7-Sonnet | 29.07 | Claude 3.7 Sonnet (anthropic-claude-3.7-sonnet) | Imported | 2026-05-27
5 | Gemini-2.0 | 28.1 | | Imported | 2026-05-27
6 | InternLM-XComposer-2.5-7B | 26.09 | | Imported | 2026-05-27
7 | Qwen 2.5-VL-7B | 12.39 | | Imported | 2026-05-27
8 | GPT-4o | 11.15 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-27
9 | Qwen 2.5-VL-72B | 10.98 | Qwen2.5 VL 72B Instruct (qwen-qwen2.5-vl-72b-instruct) | Imported | 2026-05-27
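
The ranking in the table above follows from sorting rows by the primary metric, vqa_average, in descending order. A minimal sketch of that step, assuming the rows are available as (subject, score) pairs; the names `ROWS` and `rank_by_score` are hypothetical and not part of any OmniEarth-Bench tooling:

```python
# Rows copied from the results table: (subject, vqa_average in percent).
# ROWS and rank_by_score are illustrative names, not an official API.
ROWS = [
    ("GPT-4o", 11.15),
    ("InternVL3-7B", 33.13),
    ("Qwen 2.5-VL-72B", 10.98),
    ("Claude-3.7-Sonnet", 29.07),
    ("InternVL3-72B", 33.26),
    ("Gemini-2.0", 28.1),
    ("LLaVA-Onevision-7B", 31.51),
    ("Qwen 2.5-VL-7B", 12.39),
    ("InternLM-XComposer-2.5-7B", 26.09),
]


def rank_by_score(rows):
    """Return (rank, subject, score) tuples, highest score first."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(rank, name, score) for rank, (name, score) in enumerate(ordered, start=1)]


ranked = rank_by_score(ROWS)
for rank, name, score in ranked:
    print(f"{rank:>2}  {name:<28} {score:5.2f}")
```

Sorting on the single primary metric keeps the ordering reproducible; ties, if any appeared, would keep their input order because Python's sort is stable.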