TextClass Benchmark

TextClass Benchmark evaluates LLMs and transformers for social-science text classification across multiple domains and languages, reporting domain-specific Elo leaderboards and a weighted Meta-Elo aggregate.
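
The page does not state how the weighted Meta-Elo aggregate is computed. As a minimal sketch only, the following assumes it is a weighted arithmetic mean of per-domain Elo ratings. The function name, the domain keys, the ratings, and the weights are all hypothetical and are not taken from the benchmark.

```python
from typing import Mapping


def weighted_meta_elo(domain_elo: Mapping[str, float],
                      weights: Mapping[str, float]) -> float:
    """Weighted arithmetic mean of per-domain Elo ratings.

    Assumption: the real Meta-Elo weighting scheme is not given in the
    source; this only illustrates the general idea of a weighted aggregate.
    """
    # Only aggregate domains that have both a rating and a weight.
    shared = [d for d in domain_elo if d in weights]
    total_weight = sum(weights[d] for d in shared)
    if total_weight <= 0:
        raise ValueError("need at least one domain with positive weight")
    return sum(domain_elo[d] * weights[d] for d in shared) / total_weight


# Hypothetical ratings and weights, for illustration only.
example_elo = {"domain_a": 1700.0, "domain_b": 1500.0}
example_weights = {"domain_a": 3.0, "domain_b": 1.0}
meta = weighted_meta_elo(example_elo, example_weights)
print(meta)  # (1700*3 + 1500*1) / 4 = 1650.0
```

With weights of 3 and 1, the aggregate sits closer to the rating of the more heavily weighted domain than a plain average (1600) would.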

Rows: 112
Primary metric: meta_elo
Sampled: 2026-05-06

Metadata

Metrics

Meta-Elo, Weighted F1, Cycles

Latest Results

Rows are parsed from the public upstream Meta-Elo CSV. Source model and provider display names are preserved without canonical model mapping.
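
As a rough sketch of the import step described above, the following parses a Meta-Elo CSV into ranked rows while leaving the source display names unchanged. The column names `Model` and `Meta-Elo`, the `Row` type, and the function name are assumptions for illustration; the actual upstream CSV layout is not given on this page.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Row:
    rank: int
    subject: str   # source display name, kept as-is (no canonical mapping)
    meta_elo: float


def parse_meta_elo_csv(text: str,
                       name_col: str = "Model",        # hypothetical column name
                       score_col: str = "Meta-Elo"):   # hypothetical column name
    """Parse CSV text and rank rows by Meta-Elo, highest first."""
    reader = csv.DictReader(io.StringIO(text))
    scored = [(r[name_col].strip(), float(r[score_col])) for r in reader]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [Row(i, name, score) for i, (name, score) in enumerate(scored, start=1)]


# Two rows using values that appear in the table on this page.
sample = "Model,Meta-Elo\nGemini 1.5 Pro,1782.70\nGPT-4o (2024-05-13),1825.22\n"
rows = parse_meta_elo_csv(sample)
for row in rows:
    print(row.rank, row.subject, row.meta_elo)
```

Ranks are derived from the score ordering rather than read from the file, so the sketch still works if the upstream CSV is unsorted.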

Rank | Subject | Meta-Elo | Model Match | Provenance | Sampled
1 | GPT-4o (2024-05-13) | 1825.22 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06
2 | GPT-4o (2024-11-20) | 1804.54 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06
3 | GPT-4o (2024-08-06) | 1801.72 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06
4 | Gemini 1.5 Pro | 1782.70 | - | Imported | 2026-05-06
5 | GPT-4 Turbo (2024-04-09) | 1781.47 | GPT-4 Turbo (openai-gpt-4-turbo) | Imported | 2026-05-06
6 | o1 (2024-12-17) | 1768.81 | o1 (openai-o1) | Imported | 2026-05-06
7 | GPT-4.5-preview (2025-02-27) | 1767.86 | GPT-4.5 (openai-gpt-4.5-preview) | Imported | 2026-05-06
8 | Grok 2 (1212) | 1758.36 | - | Imported | 2026-05-06
9 | Llama 3.1 (405B) | 1755.81 | - | Imported | 2026-05-06
10 | GPT-4 (0613) | 1747.59 | GPT-4 (openai-gpt-4) | Imported | 2026-05-06
11 | Llama 3.3 (70B-L) | 1746.41 | - | Imported | 2026-05-06
12 | Grok Beta | 1741.94 | - | Imported | 2026-05-06
13 | DeepSeek-V3 (671B) | 1732.54 | DeepSeek V3 (deepseek-deepseek-chat) | Imported | 2026-05-06
14 | Llama 3.1 (70B-L) | 1722.88 | - | Imported | 2026-05-06
15 | Mistral Large (2411) | 1720.36 | Mistral Large (mistralai-mistral-large) | Imported | 2026-05-06
16 | DeepSeek-R1 (671B) | 1718.73 | R1 (deepseek-r1) | Imported | 2026-05-06
17 | Gemini 2.0 Flash | 1701.95 | Gemini 2.0 Flash (google-gemini-2.0-flash) | Imported | 2026-05-06
18 | Pixtral Large (2411) | 1697.33 | - | Imported | 2026-05-06
19 | Gemini 2.0 Flash-Lite (02-05) | 1687.68 | Gemini 2.0 Flash Lite (google-gemini-2.0-flash-lite-001) | Imported | 2026-05-06
20 | o3-mini (2025-01-31) | 1684.99 | o3-mini (openai-o3-mini) | Imported | 2026-05-06
21 | Gemini 2.0 Flash Exp. | 1682.46 | - | Imported | 2026-05-06
22 | OpenThinker (32B-L) | 1678.63 | - | Imported | 2026-05-06
23 | Athene-V2 (72B-L) | 1678.14 | - | Imported | 2026-05-06
24 | Qwen 2.5 (32B-L) | 1676.19 | - | Imported | 2026-05-06
25 | GPT-4o mini (2024-07-18) | 1674.64 | GPT-4o-mini (openai-gpt-4o-mini) | Imported | 2026-05-06
26 | Nemotron (70B-L) | 1670.56 | - | Imported | 2026-05-06
27 | Gemini 1.5 Flash | 1668.98 | - | Imported | 2026-05-06
28 | Gemma 3 (27B-L) | 1665.66 | - | Imported | 2026-05-06
29 | Qwen 2.5 (72B-L) | 1659.57 | - | Imported | 2026-05-06
30 | Gemma 3 (12B-L) | 1646.63 | - | Imported | 2026-05-06
31 | o1-mini (2024-09-12) | 1626.65 | - | Imported | 2026-05-06
32 | o3 (2025-04-16) | 1625.36 | o3 (openai-o3) | Imported | 2026-05-06
33 | o1-preview (2024-09-12) | 1622.24 | o1-preview (openai-o1-preview) | Imported | 2026-05-06
34 | Mistral Saba | 1620.99 | Mistral: Saba (mistralai-mistral-saba) | Imported | 2026-05-06
35 | GLM-4 (9B-L) | 1616.51 | - | Imported | 2026-05-06
36 | Phi-4 (14B-L) | 1615.70 | Phi 4 (microsoft-phi-4) | Imported | 2026-05-06
37 | Gemini 1.5 Flash (8B) | 1611.69 | - | Imported | 2026-05-06
38 | Gemma 2 (27B-L) | 1610.06 | - | Imported | 2026-05-06
39 | QwQ (32B-L) | 1598.03 | - | Imported | 2026-05-06
40 | Sailor2 (20B-L) | 1595.95 | - | Imported | 2026-05-06
41 | Hermes 3 (70B-L) | 1593.23 | - | Imported | 2026-05-06
42 | DeepSeek-R1 D-Qwen (14B-L) | 1588.18 | - | Imported | 2026-05-06
43 | Qwen 2.5 (14B-L) | 1570.72 | - | Imported | 2026-05-06
44 | Tülu3 (70B-L) | 1569.12 | - | Imported | 2026-05-06
45 | Open Mixtral 8x22B | 1566.73 | - | Imported | 2026-05-06
46 | Llama 3.1 (8B-L) | 1561.25 | - | Imported | 2026-05-06
47 | GPT-3.5 Turbo (0125) | 1560.51 | GPT-3.5 Turbo (openai-gpt-3.5-turbo) | Imported | 2026-05-06
48 | DeepSeek-R1 D-Llama (8B-L) | 1560.36 | - | Imported | 2026-05-06
49 | Gemma 2 (9B-L) | 1559.14 | - | Imported | 2026-05-06
50 | OpenThinker (7B-L) | 1552.76 | - | Imported | 2026-05-06
51 | Notus (7B-L) | 1549.73 | - | Imported | 2026-05-06
52 | GPT-4.1 mini (2025-04-14) | 1547.62 | GPT-4.1 Mini (openai-gpt-4.1-mini) | Imported | 2026-05-06
53 | Grok 3 Mini Beta | 1546.47 | GROK Grok 3 Mini Beta (x-ai-grok-3-mini-beta) | Imported | 2026-05-06
54 | Grok 3 Beta | 1545.77 | GROK Grok 3 Beta (x-ai-grok-3-beta) | Imported | 2026-05-06
55 | Grok 3 Fast Beta | 1543.92 | - | Imported | 2026-05-06
56 | Command R7B Arabic (7B-L) | 1540.88 | - | Imported | 2026-05-06
57 | Grok 3 Mini Fast Beta | 1540.40 | - | Imported | 2026-05-06
58 | o4-mini (2025-04-16) | 1538.33 | o4 Mini (openai-o4-mini) | Imported | 2026-05-06
59 | Exaone 3.5 (32B-L) | 1535.44 | - | Imported | 2026-05-06
60 | Mistral Small (22B-L) | 1533.44 | - | Imported | 2026-05-06
61 | GPT-4.1 nano (2025-04-14) | 1533.06 | GPT-4.1 Nano (openai-gpt-4.1-nano) | Imported | 2026-05-06
62 | Falcon3 (10B-L) | 1532.01 | - | Imported | 2026-05-06
63 | GPT-4.1 (2025-04-14) | 1520.39 | GPT-4.1 (openai-gpt-4.1) | Imported | 2026-05-06
64 | Gemini 2.5 Pro (03-25) | 1517.98 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-06
65 | Mistral (7B-L) | 1511.10 | - | Imported | 2026-05-06
66 | Gemini 2.0 Flash-Lite (001) | 1508.34 | Gemini 2.0 Flash Lite (google-gemini-2.0-flash-lite-001) | Imported | 2026-05-06
67 | OLMo 2 (13B-L) | 1501.88 | - | Imported | 2026-05-06
68 | OLMo 2 (7B-L) | 1501.59 | - | Imported | 2026-05-06
69 | Claude 3.7 Sonnet (20250219) | 1500.76 | Claude 3.7 Sonnet (anthropic-claude-3.7-sonnet) | Imported | 2026-05-06
70 | Llama 4 Scout (107B) | 1500.45 | Llama 4 Scout (meta-llama-llama-4-scout) | Imported | 2026-05-06
71 | Pixtral-12B (2409) | 1490.38 | - | Imported | 2026-05-06
72 | Nous Hermes 2 (11B-L) | 1488.67 | - | Imported | 2026-05-06
73 | Yi 1.5 (34B-L) | 1485.99 | - | Imported | 2026-05-06
74 | Mistral Small 3.1 | 1484.82 | - | Imported | 2026-05-06
75 | Qwen 2.5 (7B-L) | 1477.35 | - | Imported | 2026-05-06
76 | Phi-4-mini (3.8B-L) | 1477.03 | - | Imported | 2026-05-06
77 | Llama 4 Maverick (400B) | 1473.88 | Llama 4 Maverick (meta-llama-4-maverick) | Imported | 2026-05-06
78 | Yi Large | 1473.21 | - | Imported | 2026-05-06
79 | Granite 3.2 (8B-L) | 1446.65 | - | Imported | 2026-05-06
80 | Aya Expanse (32B-L) | 1445.04 | - | Imported | 2026-05-06
81 | Marco-o1-CoT (7B-L) | 1443.46 | - | Imported | 2026-05-06
82 | Aya (35B-L) | 1436.83 | - | Imported | 2026-05-06
83 | Granite 3.1 (8B-L) | 1429.95 | - | Imported | 2026-05-06
84 | Gemma 3 (4B-L) | 1428.70 | - | Imported | 2026-05-06
85 | Aya Expanse (8B-L) | 1425.23 | - | Imported | 2026-05-06
86 | Mistral NeMo (12B-L) | 1420.94 | Mistral: Mistral Nemo (mistralai-mistral-nemo) | Imported | 2026-05-06
87 | Orca 2 (7B-L) | 1415.85 | - | Imported | 2026-05-06
88 | Nemotron-Mini (4B-L) | 1414.69 | - | Imported | 2026-05-06
89 | Claude 3.5 Haiku (20241022) | 1413.90 | Claude 3.5 Haiku (anthropic-claude-3.5-haiku) | Imported | 2026-05-06
90 | Mistral OpenOrca (7B-L) | 1396.97 | - | Imported | 2026-05-06
91 | Tülu3 (8B-L) | 1396.65 | - | Imported | 2026-05-06
92 | Hermes 3 (8B-L) | 1386.51 | - | Imported | 2026-05-06
93 | Yi 1.5 (9B-L) | 1385.39 | - | Imported | 2026-05-06
94 | Claude 3.5 Sonnet (20241022) | 1384.79 | Claude 3.5 Sonnet (anthropic-claude-3.5-sonnet) | Imported | 2026-05-06
95 | Dolphin 3.0 (8B-L) | 1381.14 | - | Imported | 2026-05-06
96 | Exaone 3.5 (8B-L) | 1371.68 | - | Imported | 2026-05-06
97 | Ministral-8B (2410) | 1346.45 | - | Imported | 2026-05-06
98 | Llama 3.2 (3B-L) | 1314.58 | - | Imported | 2026-05-06
99 | Codestral Mamba (7B) | 1312.34 | - | Imported | 2026-05-06
100 | Nous Hermes 2 Mixtral (47B-L) | 1281.37 | - | Imported | 2026-05-06
101 | Solar Pro (22B-L) | 1224.78 | - | Imported | 2026-05-06
102 | DeepSeek-R1 D-Qwen (7B-L) | 1212.76 | - | Imported | 2026-05-06
103 | Phi-3 Medium (14B-L) | 1209.37 | - | Imported | 2026-05-06
104 | Perspective 0.55 | 1180.28 | - | Imported | 2026-05-06
105 | Perspective 0.60 | 1094.77 | - | Imported | 2026-05-06
106 | Yi 1.5 (6B-L) | 1086.23 | - | Imported | 2026-05-06
107 | Granite 3 MoE (3B-L) | 1084.42 | - | Imported | 2026-05-06
108 | Perspective 0.70 | 1055.31 | - | Imported | 2026-05-06
109 | DeepSeek-R1 D-Qwen (1.5B-L) | 951.93 | - | Imported | 2026-05-06
110 | DeepScaleR (1.5B-L) | 892.60 | - | Imported | 2026-05-06
111 | Perspective 0.80 | 869.91 | - | Imported | 2026-05-06
112 | Granite 3.1 MoE (3B-L) | 758.13 | - | Imported | 2026-05-06