Lech Mazur Writing

Writing quality and style evaluation benchmark tracked in Epoch AI's capabilities dataset.

27 rows
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score (higher is better), Standard error (lower is better)

Latest Results

Rows parsed from the public leaderboard table.

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | GPT-5.2 | 8.72 | GPT-5.2 (openai-gpt-5.2) | Imported | 2026-05-06
2 | Qwen3-Max-Instruct | 8.71 | Qwen3 Max (qwen-qwen3-max) | Imported | 2026-05-06
3 | kimi-k2-thinking (official) | 8.69 | MoonshotAI: Kimi K2 Thinking (moonshotai-kimi-k2-thinking) | Imported | 2026-05-06
4 | o3 | 8.63 | o3 (openai-o3) | Imported | 2026-05-06
5 | Gemini 2.5 Pro (Jun 2025) | 8.60 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-06
6 | Claude Opus 4.5 | 8.54 | Claude Opus 4.5 (anthropic-claude-opus-4.5) | Imported | 2026-05-06
7 | DeepSeek V3 | 8.52 | DeepSeek V3 (deepseek-deepseek-chat) | Imported | 2026-05-06
8 | Qwen 3 235B | 8.49 | Qwen3 235B A22B (qwen-qwen3-235b-a22b) | Imported | 2026-05-06
9 | Qwen3-235B-A22B | 8.30 | Qwen3 235B A22B (qwen-qwen3-235b-a22b) | Imported | 2026-05-06
10 | DeepSeek R1 | 8.30 | R1 (deepseek-r1) | Imported | 2026-05-06
11 | GPT-4o | 8.18 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06
12 | Grok 4 | 8.11 | Grok 4 (x-ai-grok-4) | Imported | 2026-05-06
13 | Claude 3.7 Sonnet | 8.11 | Claude 3.7 Sonnet (anthropic-claude-3.7-sonnet) | Imported | 2026-05-06
14 | QwQ-32B | 8.07 | | Imported | 2026-05-06
15 | Gemma 3 27B | 7.99 | Gemma 3 27B (google-gemma-3-27b-it) | Imported | 2026-05-06
16 | GPT-OSS 120B | 7.73 | gpt-oss-120b (openai-gpt-oss-120b) | Imported | 2026-05-06
17 | Grok-3 mini | 7.64 | Grok 3 Mini (x-ai-grok-3-mini) | Imported | 2026-05-06
18 | GPT-4.1 | 7.56 | GPT-4.1 (openai-gpt-4.1) | Imported | 2026-05-06
19 | o4-mini-2025-04-16 medium | 7.50 | o4 Mini (openai-o4-mini) | Imported | 2026-05-06
20 | Gemini 2.0 Flash Thinking Exp | 7.38 | Gemini 2.0 Flash (google-gemini-2.0-flash) | Imported | 2026-05-06
21 | Claude 3.5 Haiku | 7.35 | Claude 3.5 Haiku (anthropic-claude-3.5-haiku) | Imported | 2026-05-06
22 | Qwen2.5-Max | 7.29 | | Imported | 2026-05-06
23 | o1 | 7.02 | o1 (openai-o1) | Imported | 2026-05-06
24 | Mistral Large | 6.90 | Mistral Large (mistralai-mistral-large) | Imported | 2026-05-06
25 | gpt-4o-mini-2024-07-18 | 6.72 | GPT-4o-mini (2024-07-18) (openai-gpt-4o-mini-2024-07-18) | Imported | 2026-05-06
26 | Llama-4-Maverick-17B-128E-Instruct | 6.37 | Llama 4 Maverick (meta-llama-4-maverick) | Imported | 2026-05-06
27 | Phi-4 | 6.26 | Phi 4 (microsoft-phi-4) | Imported | 2026-05-06
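
Because several subjects share a score, the rank column alone does not say whether two entries are actually separated. The following is a minimal Python sketch, not part of the leaderboard's own tooling, that checks a handful of rows (ranks 8 to 13, copied from the table above) for descending order and groups tied scores. The names `ROWS`, `is_ranked_by_score`, and `tied_groups` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# (rank, subject, score) for ranks 8-13, copied from the leaderboard rows.
ROWS = [
    (8, "Qwen 3 235B", 8.49),
    (9, "Qwen3-235B-A22B", 8.30),
    (10, "DeepSeek R1", 8.30),
    (11, "GPT-4o", 8.18),
    (12, "Grok 4", 8.11),
    (13, "Claude 3.7 Sonnet", 8.11),
]


def is_ranked_by_score(rows):
    """Return True if the score never increases as the rank number grows."""
    ordered = sorted(rows, key=lambda row: row[0])
    return all(a[2] >= b[2] for a, b in zip(ordered, ordered[1:]))


def tied_groups(rows):
    """Map each score shared by more than one subject to those subjects."""
    groups = defaultdict(list)
    for _rank, subject, score in rows:
        groups[score].append(subject)
    return {score: names for score, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    print(is_ranked_by_score(ROWS))
    for score, names in tied_groups(ROWS).items():
        print(score, names)
```

In this sample, ranks 9 and 10 share 8.30 and ranks 12 and 13 share 8.11, so their relative order cannot be read from the score column alone; the standard error listed under Metrics would be the natural tiebreaker to consult, though it is not shown in the table.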