AraGen v3

Arabic generative-task benchmark for chat and instruct models, evaluated with the 3C3H rubric over culturally relevant Arabic QA and safety tasks.

Rows: 69
Primary metric: 3c3h_score
Sampled: 2026-05-06

Metadata

Metrics

3C3H Score, Correctness, Completeness, Conciseness, Helpfulness, Honesty, Harmlessness, Question Answering (QA), Safety

Latest Results

Rows are parsed from the public result JSON in the Inception multilingual leaderboard Space. Values are converted to percentages, and confidence intervals are preserved when present.
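The parsing step described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the importer's actual code: the input is assumed to be a JSON list of objects whose scores are fractions in [0, 1], and the keys `model` and `3c3h_score_ci` are hypothetical (only the metric name `3c3h_score` appears on this page). The sample payload uses made-up placeholder models.

```python
import json
from typing import Any


def to_percent(value: float) -> float:
    """Convert a fraction in [0, 1] to a percentage, rounded to two decimals."""
    return round(value * 100.0, 2)


def normalize_row(raw: dict[str, Any], metric: str = "3c3h_score") -> dict[str, Any]:
    """Build one leaderboard row from a raw result entry.

    Assumption: `raw["model"]` holds the subject name and `raw[metric]`
    a fractional score. Both key names are hypothetical.
    """
    row: dict[str, Any] = {"subject": raw["model"], metric: to_percent(raw[metric])}
    ci = raw.get(f"{metric}_ci")  # hypothetical key; present only for some entries
    if ci is not None:
        low, high = ci
        # Keep the interval, converted to the same percentage scale as the score.
        row[f"{metric}_ci"] = (to_percent(low), to_percent(high))
    return row


def parse_results(payload: str, metric: str = "3c3h_score") -> list[dict[str, Any]]:
    """Parse a JSON payload and return rows sorted by score, highest first."""
    rows = [normalize_row(entry, metric) for entry in json.loads(payload)]
    rows.sort(key=lambda r: r[metric], reverse=True)
    return rows


if __name__ == "__main__":
    # Placeholder data, not real leaderboard values.
    sample = json.dumps([
        {"model": "model-a", "3c3h_score": 0.5},
        {"model": "model-b", "3c3h_score": 0.75, "3c3h_score_ci": [0.5, 1.0]},
    ])
    for rank, row in enumerate(parse_results(sample), start=1):
        print(rank, row)
```

Rows without an interval simply omit the interval key, which mirrors the "preserved when present" behaviour rather than inventing a default.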

| Rank | Subject | 3C3H Score (%) | Model Match | Match Slug | Provenance | Sampled |
|---|---|---|---|---|---|---|
| 1 | o1-2024-12-17 | 84.29 | o1 | openai-o1 | Imported | 2026-05-06 |
| 2 | gpt-5-2025-08-07 | 84.25 | GPT-5 | openai-gpt-5 | Imported | 2026-05-06 |
| 3 | o3-2025-04-16 | 82.19 | o3 | openai-o3 | Imported | 2026-05-06 |
| 4 | claude-opus-4-20250514 | 80.96 | Claude Opus 4 | anthropic-claude-opus-4 | Imported | 2026-05-06 |
| 5 | claude-opus-4-5-20251101 | 80.29 | Claude Opus 4.5 | anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 6 | claude-sonnet-4-5-20250929 | 78.17 | Claude Sonnet 4.5 | anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 7 | claude-3-7-sonnet-20250219 | 78.16 | Claude 3.7 Sonnet | anthropic-claude-3.7-sonnet | Imported | 2026-05-06 |
| 8 | claude-sonnet-4-20250514 | 75.58 | Claude Sonnet 4 | anthropic-claude-sonnet-4 | Imported | 2026-05-06 |
| 9 | gpt-4.1 | 74.54 | GPT-4.1 | openai-gpt-4.1 | Imported | 2026-05-06 |
| 10 | google/gemma-4-31B-it | 72.71 | Gemma 4 31B | google-gemma-4-31b-it | Imported | 2026-05-06 |
| 11 | o4-mini-2025-04-16 | 70.60 | o4 Mini | openai-o4-mini | Imported | 2026-05-06 |
| 12 | o1-mini-2024-09-12 | 70.46 | | | Imported | 2026-05-06 |
| 13 | gemini-2.5-pro-preview-05-06 | 68.37 | Gemini 2.5 Pro Preview 05-06 | google-gemini-2.5-pro-preview-05-06 | Imported | 2026-05-06 |
| 14 | google/gemma-4-26B-A4B-it | 67.35 | Gemma 4 26B A4B | google-gemma-4-26b-a4b-it | Imported | 2026-05-06 |
| 15 | claude-haiku-4-5-20251001 | 65.80 | Claude Haiku 4.5 | anthropic-claude-haiku-4.5 | Imported | 2026-05-06 |
| 16 | gemini-3-pro-preview | 64.15 | Gemini 3 | google-gemini-3 | Imported | 2026-05-06 |
| 17 | o3-mini-2025-01-31 | 59.81 | o3-mini | openai-o3-mini | Imported | 2026-05-06 |
| 18 | inceptionai/Jais-2-70B-Chat | 52.39 | | | Imported | 2026-05-06 |
| 19 | meta-llama/Llama-3.3-70B-Instruct | 52.12 | Llama 3.3 70B Instruct | meta-llama-llama-3.3-70b-instruct | Imported | 2026-05-06 |
| 20 | mistralai/Mistral-Large-Instruct-2411 | 52.02 | | | Imported | 2026-05-06 |
| 21 | MaziyarPanahi/calme-2.1-qwen2.5-72b | 50.81 | | | Imported | 2026-05-06 |
| 22 | gpt-4o-mini-2024-07-18 | 50.74 | GPT-4o-mini (2024-07-18) | openai-gpt-4o-mini-2024-07-18 | Imported | 2026-05-06 |
| 23 | meta-llama/Llama-3.2-90B-Vision-Instruct | 50.51 | | | Imported | 2026-05-06 |
| 24 | MaziyarPanahi/calme-2.2-qwen2.5-72b | 50.31 | | | Imported | 2026-05-06 |
| 25 | meta-llama/Llama-3.1-70B-Instruct | 50 | Llama 3.1 70B Instruct | meta-llama-llama-3.1-70b-instruct | Imported | 2026-05-06 |
| 26 | Qwen/Qwen2.5-72B-Instruct | 48.92 | Qwen2.5 72B Instruct | qwen-qwen-2.5-72b-instruct | Imported | 2026-05-06 |
| 27 | zai-org/GLM-5 | 46.01 | GLM GLM 5 | z-ai-glm-5 | Imported | 2026-05-06 |
| 28 | tiiuae/Falcon-H1-34B-Instruct | 45.74 | | | Imported | 2026-05-06 |
| 29 | JasperV13/Yehia-7B-Reasoning-preview | 45.25 | | | Imported | 2026-05-06 |
| 30 | JasperV13/Yehia-7B-Reasoning-preview | 45 | | | Imported | 2026-05-06 |
| 31 | openai/gpt-oss-120b | 43.23 | gpt-oss-120b | openai-gpt-oss-120b | Imported | 2026-05-06 |
| 32 | mistralai/Mistral-Small-24B-Instruct-2501 | 39.84 | Mistral: Mistral Small 3 | mistralai-mistral-small-24b-instruct-2501 | Imported | 2026-05-06 |
| 33 | Qwen/Qwen2.5-14B-Instruct | 36.90 | | | Imported | 2026-05-06 |
| 34 | Qwen/Qwen3-32B | 35.18 | Qwen3 32B | qwen-qwen3-32b | Imported | 2026-05-06 |
| 35 | FreedomIntelligence/AceGPT-v2-8B-Chat | 35.03 | | | Imported | 2026-05-06 |
| 36 | inceptionai/Jais-2-8B-Chat | 33.95 | | | Imported | 2026-05-06 |
| 37 | mistralai/Mistral-Small-4-119B-2603 | 33.85 | | | Imported | 2026-05-06 |
| 38 | inceptionai/jais-family-30b-16k-chat | 32.33 | | | Imported | 2026-05-06 |
| 39 | mistralai/Magistral-Small-2509 | 32.31 | | | Imported | 2026-05-06 |
| 40 | inceptionai/jais-family-30b-8k-chat | 30.77 | | | Imported | 2026-05-06 |
| 41 | openai/gpt-oss-20b | 30.61 | gpt-oss-20b | openai-gpt-oss-20b | Imported | 2026-05-06 |
| 42 | microsoft/phi-4 | 29.98 | Phi 4 | microsoft-phi-4 | Imported | 2026-05-06 |
| 43 | Qwen/Qwen3-14B | 29.74 | Qwen3 14B | qwen-qwen3-14b | Imported | 2026-05-06 |
| 44 | CohereForAI/c4ai-command-r7b-arabic-02-2025 | 27.86 | | | Imported | 2026-05-06 |
| 45 | Qwen/Qwen3-8B | 26.55 | Qwen3 8B | qwen-qwen3-8b | Imported | 2026-05-06 |
| 46 | Qwen/Qwen2.5-7B-Instruct | 26.25 | Qwen2.5 7B Instruct | qwen-qwen-2.5-7b-instruct | Imported | 2026-05-06 |
| 47 | meta-llama/Meta-Llama-3-70B-Instruct | 26.14 | Llama 3 70B Instruct | meta-llama-llama-3-70b-instruct | Imported | 2026-05-06 |
| 48 | meta-llama/Llama-3.1-8B-Instruct | 24.94 | Llama 3.1 8B Instruct | meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-06 |
| 49 | meta-llama/Llama-3.2-11B-Vision-Instruct | 24.42 | Llama 3.2 11B Vision Instruct | meta-llama-llama-3.2-11b-vision-instruct | Imported | 2026-05-06 |
| 50 | tiiuae/Falcon-H1-7B-Instruct | 24.40 | | | Imported | 2026-05-06 |
| 51 | inceptionai/jais-family-13b-chat | 24.14 | | | Imported | 2026-05-06 |
| 52 | inceptionai/jais-family-6p7b-chat | 21.29 | | | Imported | 2026-05-06 |
| 53 | mistralai/Ministral-8B-Instruct-2410 | 17.61 | | | Imported | 2026-05-06 |
| 54 | HuggingFaceTB/SmolLM3-3B | 16.24 | | | Imported | 2026-05-06 |
| 55 | QCRI/Fanar-1-9B-Instruct | 16.18 | | | Imported | 2026-05-06 |
| 56 | FreedomIntelligence/AceGPT-13B-chat | 15.91 | | | Imported | 2026-05-06 |
| 57 | tiiuae/Falcon-H1-3B-Instruct | 15.05 | | | Imported | 2026-05-06 |
| 58 | meta-llama/Llama-3.2-3B-Instruct | 14.99 | Llama 3.2 3B Instruct | meta-llama-llama-3.2-3b-instruct | Imported | 2026-05-06 |
| 59 | Qwen/Qwen2.5-1.5B-Instruct | 13.22 | | | Imported | 2026-05-06 |
| 60 | FreedomIntelligence/AceGPT-7B-chat | 12.45 | | | Imported | 2026-05-06 |
| 61 | meta-llama/Llama-3.2-1B-Instruct | 10.54 | Llama 3.2 1B Instruct | meta-llama-llama-3.2-1b-instruct | Imported | 2026-05-06 |
| 62 | meta-llama/Meta-Llama-3-8B-Instruct | 9.21 | Llama 3 8B Instruct | meta-llama-llama-3-8b-instruct | Imported | 2026-05-06 |
| 63 | silma-ai/SILMA-9B-Instruct-v1.0 | 6.70 | | | Imported | 2026-05-06 |
| 64 | Qwen/Qwen2.5-0.5B-Instruct | 5.21 | | | Imported | 2026-05-06 |
| 65 | mistralai/Mistral-7B-Instruct-v0.3 | 5.08 | | | Imported | 2026-05-06 |
| 66 | tiiuae/Falcon-H1-1.5B-Instruct | 4.58 | | | Imported | 2026-05-06 |
| 67 | Qwen/Qwen3-0.6B | 1.57 | | | Imported | 2026-05-06 |
| 68 | mistralai/Mistral-7B-Instruct-v0.2 | 0.48 | | | Imported | 2026-05-06 |
| 69 | Qwen/Qwen3.5-397B-A17B-FP8 | 0 | | | Imported | 2026-05-06 |