Vals Multimodal Index
A benchmark that reports a weighted performance score across finance, coding, and education tasks, illustrating the potential impact that LLMs can have on the economy.
16 rows
Primary metric: score
Sampled: 2026-05-28
Metadata
Metrics
Score, Std. error (lower is better), Latency (lower is better), Cost per test (lower is better)
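The index is described as a weighted performance across finance, coding, and education tasks. The page does not state the weights or the per-domain scores, so the following is only a minimal sketch of how such a weighted score could be computed. The function name `weighted_index`, the weights, and the per-domain scores are all hypothetical and are not taken from the benchmark.

```python
def weighted_index(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted mean of per-domain scores (in percent).

    Both arguments are keyed by domain name. The weights need not sum to 1;
    they are normalized by their total.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[domain] * w for domain, w in weights.items()) / total_weight


# Hypothetical per-domain scores and weights, for illustration only.
example_scores = {"finance": 60.0, "coding": 75.0, "education": 70.0}
equal_weights = {"finance": 1.0, "coding": 1.0, "education": 1.0}
finance_heavy = {"finance": 2.0, "coding": 1.0, "education": 1.0}

equal_index = weighted_index(example_scores, equal_weights)
finance_index = weighted_index(example_scores, finance_heavy)
print(f"equal weights: {equal_index:.3f}%")
print(f"finance-heavy weights: {finance_index:.3f}%")
```

Normalizing by the total weight keeps the result on the same 0 to 100 scale as the Score column, whatever weights are chosen.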
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Claude Opus 4.8 | 70.712% | Claude Opus 4.8 (`anthropic-claude-opus-4.8`) | Imported | 2026-05-28 |
| 2 | GPT 5.5 | 67.768% | GPT-5.5 (`openai-gpt-5.5`) | Imported | 2026-05-28 |
| 3 | Claude Opus 4.7 | 67.361% | Claude Opus 4.7 (`anthropic-claude-opus-4.7`) | Imported | 2026-05-28 |
| 4 | Gemini 3.5 Flash | 62.291% | Gemini 3.5 Flash (`google-gemini-3.5-flash`) | Imported | 2026-05-28 |
| 5 | Claude Sonnet 4.6 | 60.783% | Claude Sonnet 4.6 (`anthropic-claude-sonnet-4.6`) | Imported | 2026-05-28 |
| 6 | Kimi K2.6 Thinking | 56.788% | MoonshotAI: Kimi K2.6 (`moonshotai-kimi-k2.6`) | Imported | 2026-05-28 |
| 7 | Gemini 3.1 Pro Preview | 55.749% | Gemini 3.1 Pro Preview (`google-gemini-3.1-pro-preview`) | Imported | 2026-05-28 |
| 8 | GPT 5.4 Mini 2026-03-17 | 53.298% | GPT-5.4 Mini (`openai-gpt-5.4-mini`) | Imported | 2026-05-28 |
| 9 | Gemini 3 Flash Preview | 51.975% | Gemini 3 Flash Preview (`google-gemini-3-flash-preview`) | Imported | 2026-05-28 |
| 10 | Qwen 3.6 Plus | 50.737% | Qwen3.6 Plus (`qwen-qwen3.6-plus`) | Imported | 2026-05-28 |
| 11 | GPT 5.4 Nano 2026-03-17 | 47.484% | GPT-5.4 Nano (`openai-gpt-5.4-nano`) | Imported | 2026-05-28 |
| 12 | Grok 4.3 | 43.435% | Grok 4.3 (`x-ai-grok-4.3`) | Imported | 2026-05-28 |
| 13 | Claude Haiku 4.5 20251001 Thinking | 42.352% | Claude Haiku 4.5 (`anthropic-claude-haiku-4.5`) | Imported | 2026-05-28 |
| 14 | Gemini 3.1 Flash Lite Preview | 40.466% | Gemini 3.1 Flash Lite Preview (`google-gemini-3.1-flash-lite-preview`) | Imported | 2026-05-28 |
| 15 | Grok 4.20 0309 Reasoning | 38.704% | Grok 4.20 (`x-ai-grok-4.20`) | Imported | 2026-05-28 |
| 16 | Command A Plus 05 2026 | 27.186% | — | Imported | 2026-05-28 |