LLaVA-Bench in the Wild
Evaluates multimodal understanding across image, text, chart, diagram, and cross-modal reasoning tasks.
Rows: 8
Primary metric: llava_bench_wild
Sampled: 2026-05-27
Metrics: MMMU, MathVista, VQAv2, GQA, VizWiz, SQA, TextVQA, POPE, MME, MM-Bench, MM-Bench-CN, SEED-IMG, LLaVA-Bench-Wild, MM-Vet, SEED
| Rank | Subject | LLaVA-Bench-Wild | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | LLaVA-1.6 / Hermes-Yi-34B / full_ft-1e | 89.6 | — | Imported | 2026-05-27 |
| 2 | LLaVA-1.6 / Vicuna-13B / full_ft-1e | 87.3 | — | Imported | 2026-05-27 |
| 3 | LLaVA-1.6 / Mistral-7B / full_ft-1e | 83.2 | — | Imported | 2026-05-27 |
| 4 | LLaVA-1.6 / Vicuna-7B / full_ft-1e | 81.6 | — | Imported | 2026-05-27 |
| 5 | LLaVA-1.5 / 13B / full_ft-1e | 72.5 | — | Imported | 2026-05-27 |
| 6 | LLaVA-1.5 / 13B / lora-1e | 69.5 | — | Imported | 2026-05-27 |
| 7 | LLaVA-1.5 / 7B / lora-1e | 67.9 | — | Imported | 2026-05-27 |
| 8 | LLaVA-1.5 / 7B / full_ft-1e | 65.4 | — | Imported | 2026-05-27 |
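For readers who want to work with these numbers programmatically, the following is a minimal sketch of how the Rank column could be reproduced from the scores. The `ROWS` layout and the `rank` helper are illustrative assumptions, not part of any published tooling for this leaderboard; the scores are copied from the table above.

```python
# Scores copied from the LLaVA-Bench-Wild column of the table above.
# The (subject, score) tuple layout is an assumption made for this sketch.
ROWS = [
    ("LLaVA-1.6 / Hermes-Yi-34B / full_ft-1e", 89.6),
    ("LLaVA-1.6 / Vicuna-13B / full_ft-1e", 87.3),
    ("LLaVA-1.6 / Mistral-7B / full_ft-1e", 83.2),
    ("LLaVA-1.6 / Vicuna-7B / full_ft-1e", 81.6),
    ("LLaVA-1.5 / 13B / full_ft-1e", 72.5),
    ("LLaVA-1.5 / 13B / lora-1e", 69.5),
    ("LLaVA-1.5 / 7B / lora-1e", 67.9),
    ("LLaVA-1.5 / 7B / full_ft-1e", 65.4),
]


def rank(rows):
    """Return (rank, subject, score) tuples, highest score first (hypothetical helper)."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(i, subject, score) for i, (subject, score) in enumerate(ordered, start=1)]


RANKED = rank(ROWS)

if __name__ == "__main__":
    for position, subject, score in RANKED:
        print(f"{position}. {subject}: {score}")
```

Sorting on the score alone is enough here because no two rows share a value; a tie-breaking rule would have to be chosen if that ever changed.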