MMEB
Massive Multimodal Embedding Benchmark leaderboard evaluating embedding models across image classification, visual question answering, image retrieval, and visual grounding tasks.
37 rows
Primary metric: v1_overall
Sampled: 2026-05-06
Metadata
Metrics
V1 Overall, Image Classification, Image VQA, Image Retrieval, Image Visual Grounding
| Rank | Subject | V1 Overall | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | QQMM-embed | 72.17 | — | Imported | 2026-05-06 |
| 2 | B3_Qwen2_7B | 72.00 | — | Imported | 2026-05-06 |
| 3 | UniME(LLaVA-OneVision-7B-LoRA-Res336) | 70.70 | — | Imported | 2026-05-06 |
| 4 | UNITE-Instruct-7B | 70.30 | — | Imported | 2026-05-06 |
| 5 | LLaVE-7B | 70.30 | — | Imported | 2026-05-06 |
| 6 | interestFM-UIR-CAFe-7B | 69.80 | — | Imported | 2026-05-06 |
| 7 | mmE5-mllama-11b-instruct | 69.80 | — | Imported | 2026-05-06 |
| 8 | BGE-VL-v1.5 (FT; LlaVA-1.6-Mistral) | 69.40 | — | Imported | 2026-05-06 |
| 9 | B3_Qwen2_2B | 68.10 | — | Imported | 2026-05-06 |
| 10 | UniME(LLaVA-1.6-7B-LoRA-LowRes) | 66.60 | — | Imported | 2026-05-06 |
| 11 | VLM2Vec (Qwen2-VL-7B-LoRA-HighRes) | 65.80 | — | Imported | 2026-05-06 |
| 12 | LLaVE-2B | 65.20 | — | Imported | 2026-05-06 |
| 13 | UniME(Phi-3.5-V-LoRA) | 64.20 | — | Imported | 2026-05-06 |
| 14 | MMRet-MLLM (FT) | 64.10 | — | Imported | 2026-05-06 |
| 15 | UNITE-Instruct-2B | 63.30 | — | Imported | 2026-05-06 |
| 16 | VLM2Vec (LLaVA-1.6-LoRA-HighRes) | 62.90 | — | Imported | 2026-05-06 |
| 17 | BGE-VL-v1.5 (zeroshot; LlaVA-1.6-Mistral) | 60.10 | — | Imported | 2026-05-06 |
| 18 | VLM2Vec (Phi-3.5-V-LoRA) | 60.10 | — | Imported | 2026-05-06 |
| 19 | interestFM-UIR-CAFe-0.5B | 59.60 | — | Imported | 2026-05-06 |
| 20 | VLM2Vec (Qwen2-VL-2B-LoRA-HighRes) | 59.30 | — | Imported | 2026-05-06 |
| 21 | LLaVE-0.5B | 59.10 | — | Imported | 2026-05-06 |
| 22 | mmE5 (w/ 560K synthetic data) | 58.60 | — | Imported | 2026-05-06 |
| 23 | VLM2Vec (Phi-3.5-V-FT) | 55.90 | — | Imported | 2026-05-06 |
| 24 | gme-Qwen2-VL-2B-Instruct | 55.80 | — | Imported | 2026-05-06 |
| 25 | VLM2Vec (LLaVA-1.6-LoRA-LowRes) | 55.00 | — | Imported | 2026-05-06 |
| 26 | MM-Embed | 50.00 | — | Imported | 2026-05-06 |
| 27 | OpenCLIP-FT | 47.20 | — | Imported | 2026-05-06 |
| 28 | CLIP-FT | 45.40 | — | Imported | 2026-05-06 |
| 29 | UniIR (CLIP_SF) | 44.70 | — | Imported | 2026-05-06 |
| 30 | MMRet-MLLM (LLaVA-1.6) | 44.00 | — | Imported | 2026-05-06 |
| 31 | UniIR (BLIP_FF) | 42.80 | — | Imported | 2026-05-06 |
| 32 | open_clip-ViT-L/14 | 39.70 | — | Imported | 2026-05-06 |
| 33 | clip-vit-large-patch14 | 37.80 | — | Imported | 2026-05-06 |
| 34 | siglip-base-patch16-224 | 34.80 | — | Imported | 2026-05-06 |
| 35 | Magiclens | 27.80 | — | Imported | 2026-05-06 |
| 36 | blip2-opt-2.7b | 25.20 | — | Imported | 2026-05-06 |
| 37 | e5-v | 13.30 | — | Imported | 2026-05-06 |