Video-MME
Video-MME evaluates multimodal video understanding across short, medium, and long videos, with and without subtitle context.
Rows: 51
Primary metric: overall_with_subtitles
Sampled: 2026-05-06
Metadata
Metrics
Overall w/o subs, Overall w subs, Short Video w/o subs, Short Video w subs, Medium Video w/o subs, Medium Video w subs, Long Video w/o subs, Long Video w subs
| Rank | Subject | Overall w subs | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | video-SALMONN 2+ | 81.60 | — | Imported | 2026-05-06 |
| 2 | Gemini 1.5 Pro | 81.30 | — | Imported | 2026-05-06 |
| 3 | AdaReTaKe | 79.60 | — | Imported | 2026-05-06 |
| 4 | JT-VL-Chat | 79.10 | — | Imported | 2026-05-06 |
| 5 | Qwen2-VL | 77.80 | — | Imported | 2026-05-06 |
| 6 | GPT-4o | 77.20 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06 |
| 7 | LLaVA-Video | 76.90 | — | Imported | 2026-05-06 |
| 8 | Gemini 1.5 Flash | 75.00 | — | Imported | 2026-05-06 |
| 9 | Oryx-1.5 | 74.90 | — | Imported | 2026-05-06 |
| 10 | InternVL2.5 | 74.00 | — | Imported | 2026-05-06 |
| 11 | Keye-VL-1.5 | 74.00 | — | Imported | 2026-05-06 |
| 12 | TSPO | 73.70 | — | Imported | 2026-05-06 |
| 13 | ViLAMP | 73.50 | — | Imported | 2026-05-06 |
| 14 | Aria | 72.10 | — | Imported | 2026-05-06 |
| 15 | Long-VITA | 72.00 | — | Imported | 2026-05-06 |
| 16 | LinVT | 71.70 | — | Imported | 2026-05-06 |
| 17 | TPO | 71.50 | — | Imported | 2026-05-06 |
| 18 | LiveCC | 70.30 | — | Imported | 2026-05-06 |
| 19 | VideoLLaMA 3 | 70.30 | — | Imported | 2026-05-06 |
| 20 | NVILA | 70.00 | — | Imported | 2026-05-06 |
| 21 | QuoTA | 70.00 | — | Imported | 2026-05-06 |
| 22 | VideoChat-Flash | 69.70 | — | Imported | 2026-05-06 |
| 23 | LLaVA-OneVision | 69.60 | — | Imported | 2026-05-06 |
| 24 | GPT-4o mini | 68.90 | GPT-4o-mini (openai-gpt-4o-mini) | Imported | 2026-05-06 |
| 25 | ReTaKe | 68.90 | — | Imported | 2026-05-06 |
| 26 | ByteVideoLLM | 68.80 | — | Imported | 2026-05-06 |
| 27 | mPLUG-Owl3 | 68.10 | — | Imported | 2026-05-06 |
| 28 | MiniCPM-o 2.6 | 67.90 | — | Imported | 2026-05-06 |
| 29 | VideoLLaMA 2 | 64.70 | — | Imported | 2026-05-06 |
| 30 | MiniCPM-V 2.6 | 63.70 | — | Imported | 2026-05-06 |
| 31 | GPT-4V | 63.30 | GPT-4 (openai-gpt-4) | Imported | 2026-05-06 |
| 32 | Claude 3.5 Sonnet | 62.90 | Claude 3.5 Sonnet (anthropic-claude-3.5-sonnet) | Imported | 2026-05-06 |
| 33 | TimeMarker | 62.80 | — | Imported | 2026-05-06 |
| 34 | InternVL2 | 62.40 | — | Imported | 2026-05-06 |
| 35 | Video-XL | 61.00 | — | Imported | 2026-05-06 |
| 36 | VITA | 59.20 | — | Imported | 2026-05-06 |
| 37 | VITA 1.5 | 58.70 | — | Imported | 2026-05-06 |
| 38 | Kangaroo | 57.60 | — | Imported | 2026-05-06 |
| 39 | Video-CCAM | 57.40 | — | Imported | 2026-05-06 |
| 40 | Long-LLaVA | 57.10 | — | Imported | 2026-05-06 |
| 41 | LongVA | 54.30 | — | Imported | 2026-05-06 |
| 42 | InternVL-Chat-V1.5 | 52.40 | — | Imported | 2026-05-06 |
| 43 | Qwen-VL-Max | 51.20 | Qwen VL Max (qwen-qwen-vl-max) | Imported | 2026-05-06 |
| 44 | ShareGemini | 47.90 | — | Imported | 2026-05-06 |
| 45 | SliME | 47.20 | — | Imported | 2026-05-06 |
| 46 | Chat-UniVi-v1.5 | 45.90 | — | Imported | 2026-05-06 |
| 47 | VideoChat2-Mistral | 43.80 | — | Imported | 2026-05-06 |
| 48 | ShareGPT4Video | 43.60 | — | Imported | 2026-05-06 |
| 49 | ST-LLM | 42.30 | — | Imported | 2026-05-06 |
| 50 | Qwen-VL-Chat | 41.90 | — | Imported | 2026-05-06 |
| 51 | Video-LLaVA | 41.60 | — | Imported | 2026-05-06 |