LanguageBench
LanguageBench evaluates AI models across many languages and tasks, including translation, classification, multilingual Q&A, advanced Q&A, and math.
33 rows
Primary metric: average
Sampled: 2026-05-06
Metadata
Metrics: Overall, Translation (from), Translation (to), Classification, Q&A, Advanced Q&A, Math
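The page names an average as the primary metric but does not say how it is computed. As a rough sketch only, assuming the Overall score is an unweighted mean of the six task metrics listed above (an assumption, not something the source confirms), the aggregation could look like this. The function name `overall_score` and the example values are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean

# Task metric names as listed in the page metadata.
# "Overall" is left out because it is the value being derived.
TASK_METRICS = [
    "Translation (from)",
    "Translation (to)",
    "Classification",
    "Q&A",
    "Advanced Q&A",
    "Math",
]


def overall_score(task_scores: dict[str, float]) -> float:
    """Unweighted mean of the available task scores, rounded to 2 decimals.

    Assumption: equal weights per task. The real benchmark may weight
    tasks or languages differently.
    """
    present = [task_scores[name] for name in TASK_METRICS if name in task_scores]
    if not present:
        raise ValueError("no task scores available")
    return round(mean(present), 2)


# Hypothetical placeholder scores, NOT real leaderboard data.
example_scores = {
    "Translation (from)": 0.5,
    "Translation (to)": 0.5,
    "Classification": 0.75,
    "Q&A": 0.75,
    "Advanced Q&A": 0.25,
    "Math": 0.25,
}

print(overall_score(example_scores))
```

Under this assumption a single weak task pulls the Overall score down in proportion to its share of the six metrics, so the sub-metric columns are worth checking before reading too much into the low-ranked rows.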
| Rank | Subject | Overall | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | google/gemini-2.0-flash-001 | 0.69 | Gemini 2.0 Flash (`google-gemini-2.0-flash`) | Imported | 2026-05-06 |
| 2 | anthropic/claude-3.5-sonnet | 0.68 | Claude 3.5 Sonnet (`anthropic-claude-3.5-sonnet`) | Imported | 2026-05-06 |
| 3 | anthropic/claude-3.7-sonnet | 0.68 | Claude 3.7 Sonnet (`anthropic-claude-3.7-sonnet`) | Imported | 2026-05-06 |
| 4 | google/gemini-2.5-flash | 0.68 | Gemini 2.5 Flash (`google-gemini-2.5-flash`) | Imported | 2026-05-06 |
| 5 | anthropic/claude-sonnet-4 | 0.67 | Claude Sonnet 4 (`anthropic-claude-sonnet-4`) | Imported | 2026-05-06 |
| 6 | openai/gpt-4.1 | 0.66 | GPT-4.1 (`openai-gpt-4.1`) | Imported | 2026-05-06 |
| 7 | google/gemini-2.0-flash-lite-001 | 0.65 | Gemini 2.0 Flash Lite (`google-gemini-2.0-flash-lite-001`) | Imported | 2026-05-06 |
| 8 | deepseek/deepseek-chat | 0.64 | DeepSeek V3 (`deepseek-deepseek-chat`) | Imported | 2026-05-06 |
| 9 | google/gemini-flash-1.5 | 0.64 | — | Imported | 2026-05-06 |
| 10 | meta-llama/llama-4-maverick | 0.63 | Llama 4 Maverick (`meta-llama-4-maverick`) | Imported | 2026-05-06 |
| 11 | openai/gpt-4.1-mini | 0.60 | GPT-4.1 Mini (`openai-gpt-4.1-mini`) | Imported | 2026-05-06 |
| 12 | google/gemini-flash-1.5-8b | 0.56 | — | Imported | 2026-05-06 |
| 13 | openai/gpt-4o-mini | 0.55 | GPT-4o-mini (`openai-gpt-4o-mini`) | Imported | 2026-05-06 |
| 14 | meta-llama/llama-3.3-70b-instruct | 0.53 | Llama 3.3 70B Instruct (`meta-llama-llama-3.3-70b-instruct`) | Imported | 2026-05-06 |
| 15 | openai/gpt-4.1-nano | 0.52 | GPT-4.1 Nano (`openai-gpt-4.1-nano`) | Imported | 2026-05-06 |
| 16 | mistralai/mistral-saba | 0.51 | Mistral: Saba (`mistralai-mistral-saba`) | Imported | 2026-05-06 |
| 17 | deepseek/deepseek-chat-v3-0324 | 0.51 | DeepSeek V3 0324 (`deepseek-deepseek-chat-v3-0324`) | Imported | 2026-05-06 |
| 18 | meta-llama/llama-3-70b-instruct | 0.51 | Llama 3 70B Instruct (`meta-llama-llama-3-70b-instruct`) | Imported | 2026-05-06 |
| 19 | meta-llama/llama-3.1-70b-instruct | 0.51 | Llama 3.1 70B Instruct (`meta-llama-llama-3.1-70b-instruct`) | Imported | 2026-05-06 |
| 20 | google/gemma-3-27b-it | 0.50 | Gemma 3 27B (`google-gemma-3-27b-it`) | Imported | 2026-05-06 |
| 21 | mistralai/mistral-small-3.1-24b-instruct | 0.49 | Mistral: Mistral Small 3.1 24B (`mistralai-mistral-small-3.1-24b-instruct`) | Imported | 2026-05-06 |
| 22 | microsoft/phi-4 | 0.45 | Phi 4 (`microsoft-phi-4`) | Imported | 2026-05-06 |
| 23 | amazon/nova-micro-v1 | 0.43 | Nova Micro 1.0 (`amazon-nova-micro-v1`) | Imported | 2026-05-06 |
| 24 | openai/gpt-3.5-turbo-0613 | 0.41 | GPT-3.5 Turbo (older v0613) (`openai-gpt-3.5-turbo-0613`) | Imported | 2026-05-06 |
| 25 | mistralai/mistral-nemo | 0.37 | Mistral: Mistral Nemo (`mistralai-mistral-nemo`) | Imported | 2026-05-06 |
| 26 | microsoft/phi-4-multimodal-instruct | 0.32 | — | Imported | 2026-05-06 |
| 27 | gryphe/mythomax-l2-13b | 0.25 | MythoMax 13B (`gryphe-mythomax-l2-13b`) | Imported | 2026-05-06 |
| 28 | deepseek/deepseek-r1 | 0.17 | R1 (`deepseek-r1`) | Imported | 2026-05-06 |
| 29 | qwen/qwen3-32b | 0.14 | Qwen3 32B (`qwen-qwen3-32b`) | Imported | 2026-05-06 |
| 30 | qwen/qwen3-235b-a22b | 0.13 | Qwen3 235B A22B (`qwen-qwen3-235b-a22b`) | Imported | 2026-05-06 |
| 31 | deepseek/deepseek-r1-0528 | 0.12 | R1 0528 (`deepseek-deepseek-r1-0528`) | Imported | 2026-05-06 |
| 32 | qwen/qwen3-30b-a3b | 0.08 | Qwen3 30B A3B (`qwen-qwen3-30b-a3b`) | Imported | 2026-05-06 |
| 33 | google/gemini-2.5-pro | 0.05 | Gemini 2.5 Pro (`google-gemini-2.5-pro`) | Imported | 2026-05-06 |