INCLUDE-base-44 European Languages

European-language slice of INCLUDE-base-44, evaluating multilingual LLMs on knowledge- and reasoning-centric multiple-choice questions across 20 European languages.

35 rows
Primary metric: average_accuracy
Sampled: 2026-05-06

Metadata

Metrics

Average Accuracy, Albanian Accuracy, Belarusian Accuracy, Bulgarian Accuracy, Croatian Accuracy, Dutch Accuracy, Estonian Accuracy, Finnish Accuracy, French Accuracy, German Accuracy, Greek Accuracy, Hungarian Accuracy, Italian Accuracy, Lithuanian Accuracy, Polish Accuracy, Portuguese Accuracy, Russian Accuracy, Serbian Accuracy, Spanish Accuracy, Turkish Accuracy, Ukrainian Accuracy

Latest Results

Rows are parsed from the public Space results.csv. Average accuracy is computed across the 20 European language rows described by the Space.

| Rank | Subject | Average Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GaMS3-12B-Instruct,bos | 0.66 | | Imported | 2026-05-06 |
| 2 | Qwen3-14B | 0.65 | Qwen3 14B (qwen-qwen3-14b) | Imported | 2026-05-06 |
| 3 | Bielik-11B-v3.0-Instruct | 0.65 | | Imported | 2026-05-06 |
| 4 | cogito-v1-preview-qwen-14B | 0.64 | | Imported | 2026-05-06 |
| 5 | gemma-3-12b-it,bos | 0.64 | | Imported | 2026-05-06 |
| 6 | Qwen2.5-14B-Instruct | 0.62 | | Imported | 2026-05-06 |
| 7 | Qwen2.5-14B | 0.61 | | Imported | 2026-05-06 |
| 8 | Qwen3-8B | 0.61 | Qwen3 8B (qwen-qwen3-8b) | Imported | 2026-05-06 |
| 9 | phi-4 | 0.59 | Phi 4 (microsoft-phi-4) | Imported | 2026-05-06 |
| 10 | Apertus-8B-Instruct-2509 | 0.58 | | Imported | 2026-05-06 |
| 11 | Llama-3.1-8B-Instruct | 0.55 | Llama 3.1 8B Instruct (meta-llama-llama-3.1-8b-instruct) | Imported | 2026-05-06 |
| 12 | EuroLLM-9B-Instruct | 0.55 | | Imported | 2026-05-06 |
| 13 | Qwen2.5-7B | 0.55 | | Imported | 2026-05-06 |
| 14 | Apertus-8B-2509 | 0.55 | | Imported | 2026-05-06 |
| 15 | Qwen2.5-7B-Instruct | 0.54 | Qwen2.5 7B Instruct (qwen-qwen-2.5-7b-instruct) | Imported | 2026-05-06 |
| 16 | Mistral-Nemo-Instruct-2407 | 0.53 | Mistral: Mistral Nemo (mistralai-mistral-nemo) | Imported | 2026-05-06 |
| 17 | Bielik-11B-v2.6-Instruct | 0.51 | | Imported | 2026-05-06 |
| 18 | Bielik-11B-v2.5-Instruct | 0.51 | | Imported | 2026-05-06 |
| 19 | Mistral-Nemo-Base-2407 | 0.51 | | Imported | 2026-05-06 |
| 20 | Bielik-11B-v2.1-Instruct | 0.51 | | Imported | 2026-05-06 |
| 21 | Bielik-11B-v2.3-Instruct | 0.51 | | Imported | 2026-05-06 |
| 22 | Bielik-11B-v2.2-Instruct | 0.51 | | Imported | 2026-05-06 |
| 23 | EuroLLM-9B | 0.49 | | Imported | 2026-05-06 |
| 24 | Mistral-7B-Instruct-v0.2 | 0.45 | | Imported | 2026-05-06 |
| 25 | aya-expanse-8b | 0.45 | | Imported | 2026-05-06 |
| 26 | Bielik-11B-v2 | 0.45 | | Imported | 2026-05-06 |
| 27 | PLLuM-12B-chat | 0.45 | | Imported | 2026-05-06 |
| 28 | pllum-12b-nc-chat-250715 | 0.44 | | Imported | 2026-05-06 |
| 29 | pllum-12b-nc-instruct-250715 | 0.43 | | Imported | 2026-05-06 |
| 30 | Mistral-7B-v0.2 | 0.42 | | Imported | 2026-05-06 |
| 31 | pllum-12b-nc-base-250715 | 0.38 | | Imported | 2026-05-06 |
| 32 | Bielik-4.5B-v3 | 0.36 | | Imported | 2026-05-06 |
| 33 | PLLuM-12B-base-250801 | 0.35 | | Imported | 2026-05-06 |
| 34 | Bielik-7B-Instruct-v0.1 | 0.34 | | Imported | 2026-05-06 |
| 35 | Llama-PLLuM-8B-base-250801 | 0.30 | | Imported | 2026-05-06 |