SEED-Bench
SEED-Bench: Evaluates multimodal understanding across image, text, chart, diagram, and cross-modal reasoning tasks.
Rows: 63
Primary metric: avg_all
Sampled: 2026-05-06
Metadata
Metrics
Avg. All, Avg. Img, Avg. Video, Scene Understanding, Instance Identity, Instance Attribute, Instance Location, Instance Counting, Spatial Relation, Instance Interaction, Visual Reasoning, Text Recognition, Action Recognition, Action Prediction, Procedure Understanding
| Rank | Subject | Avg. All | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | InternVL-Chat-V1.2-Plus | 70.40 | — | Imported | 2026-05-06 |
| 2 | Weitu-VL-1.0 | 69.20 | — | Imported | 2026-05-06 |
| 3 | SPHINXv2-1k | 67.50 | — | Imported | 2026-05-06 |
| 4 | GPT-4V | 67.30 | GPT-4 (openai-gpt-4) | Imported | 2026-05-06 |
| 5 | Qwen-VL-plus | 66.80 | Qwen VL Plus (qwen-qwen-vl-plus) | Imported | 2026-05-06 |
| 6 | SPHINXv1-1k | 63.90 | — | Imported | 2026-05-06 |
| 7 | llava-v1.5-7b-finetune | 62.80 | — | Imported | 2026-05-06 |
| 8 | LLaVA-v1.5-LoRA | 62.80 | — | Imported | 2026-05-06 |
| 9 | LLaVA-v1.5-13B-LoRA | 62.40 | — | Imported | 2026-05-06 |
| 10 | InfMLLM-13B | 62.30 | — | Imported | 2026-05-06 |
| 11 | LLaVA-1.5 | 61.60 | — | Imported | 2026-05-06 |
| 12 | llava-v1.5-7b-dsn-ft | 61.50 | — | Imported | 2026-05-06 |
| 13 | llava-v1.5-7b-910b | 61.00 | — | Imported | 2026-05-06 |
| 14 | Unified-IO-2 7B (2.5M) | 60.50 | — | Imported | 2026-05-06 |
| 15 | Unified-IO-2 7B | 60.40 | — | Imported | 2026-05-06 |
| 16 | Unified-IO-2 3B (3M) | 60.20 | — | Imported | 2026-05-06 |
| 17 | LLaMA-VID-7B | 59.90 | — | Imported | 2026-05-06 |
| 18 | Unified-IO-2 3B | 58.70 | — | Imported | 2026-05-06 |
| 19 | Qwen-VL-Chat | 58.20 | — | Imported | 2026-05-06 |
| 20 | mPLUG-Owl2 | 57.80 | — | Imported | 2026-05-06 |
| 21 | hh_resampler_v2-dsn | 57.30 | — | Imported | 2026-05-06 |
| 22 | hh_v3_dsn | 57.00 | — | Imported | 2026-05-06 |
| 23 | hh_resampler_llava | 56.60 | — | Imported | 2026-05-06 |
| 24 | Qwen-VL | 56.30 | — | Imported | 2026-05-06 |
| 25 | InstructBLIP-Vicuna | 53.40 | — | Imported | 2026-05-06 |
| 26 | InstructBLIP | 52.70 | — | Imported | 2026-05-06 |
| 27 | Kosmos-2 | 50.00 | — | Imported | 2026-05-06 |
| 28 | Unified-IO-2 1B | 49.60 | — | Imported | 2026-05-06 |
| 29 | SEED-LLaMA | 48.90 | — | Imported | 2026-05-06 |
| 30 | BLIP-2 | 46.40 | — | Imported | 2026-05-06 |
| 31 | MiniGPT-4 | 42.80 | — | Imported | 2026-05-06 |
| 32 | Claude-3-Opus | 40.90 | — | Imported | 2026-05-06 |
| 33 | OpenFlamingo | 40.90 | — | Imported | 2026-05-06 |
| 34 | Otter | 39.70 | — | Imported | 2026-05-06 |
| 35 | VPGTrans | 39.10 | — | Imported | 2026-05-06 |
| 36 | VideoChat | 37.60 | — | Imported | 2026-05-06 |
| 37 | mPLUG-Owl | 34.00 | — | Imported | 2026-05-06 |
| 38 | Otter | 33.90 | — | Imported | 2026-05-06 |
| 39 | GVT | 33.50 | — | Imported | 2026-05-06 |
| 40 | MultiModal-GPT | 33.20 | — | Imported | 2026-05-06 |
| 41 | OpenFlamingo | 33.10 | — | Imported | 2026-05-06 |
| 42 | LLaMA-AdapterV2 | 32.70 | — | Imported | 2026-05-06 |
| 43 | Video-ChatGPT | 31.20 | — | Imported | 2026-05-06 |
| 44 | Valley | 30.30 | — | Imported | 2026-05-06 |
| 45 | Vicuna | 28.50 | — | Imported | 2026-05-06 |
| 46 | Flan-T5 | 27.70 | — | Imported | 2026-05-06 |
| 47 | LLaMA | 26.80 | — | Imported | 2026-05-06 |
| 48 | ALIP_llava | 0 | — | Imported | 2026-05-06 |
| 49 | DreamLIP | 0 | — | Imported | 2026-05-06 |
| 50 | DreamLIP_30m | 0 | — | Imported | 2026-05-06 |
| 51 | Gemini-Pro-Vision | 0 | — | Imported | 2026-05-06 |
| 52 | Honeybee-13B | 0 | — | Imported | 2026-05-06 |
| 53 | IDEFICS-80b-instruct | 0 | — | Imported | 2026-05-06 |
| 54 | IDEFICS-9b-instruct | 0 | — | Imported | 2026-05-06 |
| 55 | InternLM-XComposer-VL | 0 | — | Imported | 2026-05-06 |
| 56 | InternLM-XComposer2-VL-7B | 0 | — | Imported | 2026-05-06 |
| 57 | LaCLIP_llava | 0 | — | Imported | 2026-05-06 |
| 58 | LLaVA-7B + detection and grounding trained | 0 | — | Imported | 2026-05-06 |
| 59 | MiniCPM-Llama3-V2.5 | 0 | — | Imported | 2026-05-06 |
| 60 | MiniCPM-V-2 | 0 | — | Imported | 2026-05-06 |
| 61 | Pink-LLaMA2 | 0 | — | Imported | 2026-05-06 |
| 62 | ShareGPT4V-13B | 0 | — | Imported | 2026-05-06 |
| 63 | ShareGPT4V-7B | 0 | — | Imported | 2026-05-06 |
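
The Rank column above appears consistent with sorting by the primary metric (avg_all) in descending order and breaking ties by case-insensitive subject name. This rule is inferred from the rows, not documented by the source. A minimal Python sketch of that ordering follows. The `Row` type and `rank_rows` helper are hypothetical names introduced for illustration, and the sample uses a handful of rows copied from the table.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One leaderboard entry (hypothetical container, not from the source)."""
    subject: str
    avg_all: float


def rank_rows(rows):
    """Assign 1-based ranks: higher avg_all first, ties broken by
    case-insensitive subject name (an assumed rule inferred from the table)."""
    ordered = sorted(rows, key=lambda r: (-r.avg_all, r.subject.lower()))
    return [(i, r.subject, r.avg_all) for i, r in enumerate(ordered, start=1)]


# A few rows copied from the table, deliberately shuffled.
sample = [
    Row("OpenFlamingo", 40.90),
    Row("LLaVA-v1.5-LoRA", 62.80),
    Row("Claude-3-Opus", 40.90),
    Row("llava-v1.5-7b-finetune", 62.80),
    Row("InternVL-Chat-V1.2-Plus", 70.40),
]

for rank, subject, score in rank_rows(sample):
    print(f"{rank:>2}  {subject:<26} {score:.2f}")
```

Under this assumed rule, the two entries tied at 62.80 and the two tied at 40.90 come out in the same relative order as in the table.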