LegalBench

Evaluating language models on a wide range of open source legal reasoning tasks.

Rows: 111
Primary metric: score
Sampled: 2026-05-28

Metadata

Metrics

Score, Std. error (lower is better), Latency (lower is better), Cost per test (lower is better)

Latest Results

Full leaderboard rows decoded from the Vals.ai benchmark detail page. Primary score is the Overall accuracy percentage.

Rank Subject Score Model Match Provenance Sampled
1 Gemini 3.1 Pro Preview 87.398% Gemini 3.1 Pro Preview
google-gemini-3.1-pro-preview
Imported 2026-05-28
2 Gemini 3 Pro Preview 87.025% Gemini 3
google-gemini-3
Imported 2026-05-28
3 Gemini 3 Flash Preview 86.858% Gemini 3 Flash Preview
google-gemini-3-flash-preview
Imported 2026-05-28
4 GPT 5.5 86.515% GPT-5.5
openai-gpt-5.5
Imported 2026-05-28
5 GPT 5.4 2026-03-05 86.044% GPT-5.4
openai-gpt-5.4
Imported 2026-05-28
6 GPT 5 2025-08-07 86.023% GPT-5
openai-gpt-5
Imported 2026-05-28
7 GPT 5.1 2025-11-13 85.683% GPT-5.1
openai-gpt-5.1
Imported 2026-05-28
8 Claude Opus 4.6 Thinking 85.301% Claude Opus 4.6
anthropic-claude-opus-4.6
Imported 2026-05-28
9 Claude Opus 4.7 85.251% Claude Opus 4.7
anthropic-claude-opus-4.7
Imported 2026-05-28
10 Qwen 3.5 Plus Thinking 85.104% Imported 2026-05-28
11 Qwen 3.7 Max 84.913% Qwen3.7 Max
qwen-qwen3.7-max
Imported 2026-05-28
12 Kimi K2.6 Thinking 84.738% MoonshotAI: Kimi K2.6
moonshotai-kimi-k2.6
Imported 2026-05-28
13 Claude Opus 4.5 20251101 Thinking 84.604% Claude Opus 4.5
anthropic-claude-opus-4.5
Imported 2026-05-28
14 Grok 4.3 84.458% Grok 4.3
x-ai-grok-4.3
Imported 2026-05-28
15 GLM 5.1 Thinking 84.394% GLM 5.1
z-ai-glm-5.1
Imported 2026-05-28
16 Gemini 2.5 Pro Exp 03 25 84.322% Imported 2026-05-28
17 Qwen 3.5 Flash 84.276% Qwen3.5-Flash
qwen-qwen3.5-flash-02-23
Imported 2026-05-28
18 Qwen 3.6 Plus 84.233% Qwen3.6 Plus
qwen-qwen3.6-plus
Imported 2026-05-28
19 Muse Spark 84.217% Imported 2026-05-28
20 Claude Sonnet 4.5 20250929 Thinking 84.084% Claude Sonnet 4.5
anthropic-claude-sonnet-4.5
Imported 2026-05-28
21 GLM 5 Thinking 84.059% GLM 5
z-ai-glm-5
Imported 2026-05-28
22 MiniMax M2.7 83.98% MiniMax M2.7
minimax-minimax-m2.7
Imported 2026-05-28
23 Gemini 2.5 Flash Preview 04 17 83.796% Imported 2026-05-28
24 Gemini 3.1 Flash Lite Preview 83.764% Gemini 3.1 Flash Lite Preview
google-gemini-3.1-flash-lite-preview
Imported 2026-05-28
25 O3 2025-04-16 83.761% o3
openai-o3
Imported 2026-05-28
26 Gemini 3.5 Flash 83.602% Gemini 3.5 Flash
google-gemini-3.5-flash
Imported 2026-05-28
27 Claude Opus 4.8 83.568% Claude Opus 4.8
anthropic-claude-opus-4.8
Imported 2026-05-28
28 Claude Opus 4.1 20250805 83.458% Claude Opus 4.1
anthropic-claude-opus-4.1
Imported 2026-05-28
29 GLM 4.7 83.358% GLM 4.7
z-ai-glm-4.7
Imported 2026-05-28
30 Grok 4 0709 83.192% Grok 4
x-ai-grok-4
Imported 2026-05-28
31 Grok 3 Mini Fast High Reasoning 83.143% Imported 2026-05-28
32 GPT 4.1 2025-04-14 83.1% GPT-4.1
openai-gpt-4.1
Imported 2026-05-28
33 Claude Opus 4 20250514 83.071% Claude Opus 4
anthropic-claude-opus-4
Imported 2026-05-28
34 Claude Sonnet 4 20250514 82.954% Claude Sonnet 4
anthropic-claude-sonnet-4
Imported 2026-05-28
35 Claude Opus 4.5 20251101 82.837% Claude Opus 4.5
anthropic-claude-opus-4.5
Imported 2026-05-28
36 GPT 5.2 2025-12-11 82.764% GPT-5.2
openai-gpt-5.2
Imported 2026-05-28
37 Gemini 2.5 Flash Preview 09 2025 Thinking 82.625% Imported 2026-05-28
38 Grok 3 82.595% Grok 3
xaigrok-3
Imported 2026-05-28
39 Claude 3 7 Sonnet 20250219 Thinking 82.519% Imported 2026-05-28
40 Gemini 2.5 Flash Preview 09 2025 82.486% Imported 2026-05-28
41 Grok 4.1 Fast Reasoning 82.451% Grok 4.1 Fast
x-ai-grok-4.1-fast
Imported 2026-05-28
42 GPT 4O 2024-11-20 82.214% GPT-4o (2024-11-20)
openai-gpt-4o-2024-11-20
Imported 2026-05-28
43 Claude Sonnet 4.6 82.12% Claude Sonnet 4.6
anthropic-claude-sonnet-4.6
Imported 2026-05-28
44 Claude Sonnet 4 20250514 Thinking 82.065% Imported 2026-05-28
45 Gemini 2.5 Flash Lite Preview 09 2025 Thinking 82.044% Imported 2026-05-28
46 Grok 3 Mini Fast Low Reasoning 81.925% Imported 2026-05-28
47 Qwen 3 Max 81.861% Qwen3 Max
qwen-qwen3-max
Imported 2026-05-28
48 GPT 5 Mini 2025-08-07 81.77% GPT-5 Mini
openai-gpt-5-mini
Imported 2026-05-28
49 Kimi K2 Instruct 81.454% MoonshotAI: Kimi K2 0711
moonshotai-kimi-k2
Imported 2026-05-28
50 Claude Haiku 4.5 20251001 Thinking 81.238% Claude Haiku 4.5
anthropic-claude-haiku-4.5
Imported 2026-05-28
51 DeepSeek V3 80.762% DeepSeek V3
deepseek-deepseek-chat
Imported 2026-05-28
52 Grok 4 Fast Reasoning 80.601% Grok 4 Fast
x-ai-grok-4-fast
Imported 2026-05-28
53 GPT 4 Turbo 80.462% GPT-4 Turbo
openai-gpt-4-turbo
Imported 2026-05-28
54 O1 2024-12-17 80.393% o1
openai-o1
Imported 2026-05-28
55 Qwen 3 Max Preview 80.333% Imported 2026-05-28
56 DeepSeek V4 Pro 80.323% DeepSeek V4 Pro
deepseek-deepseek-v4-pro
Imported 2026-05-28
57 Kimi K2 Thinking 80.201% MoonshotAI: Kimi K2 Thinking
moonshotai-kimi-k2-thinking
Imported 2026-05-28
58 Qwen 3 235B A22b 80.179% Qwen3 235B A22B
qwen-qwen3-235b-a22b
Imported 2026-05-28
59 GPT 4O 2024-08-06 80.12% GPT-4o (2024-08-06)
openai-gpt-4o-2024-08-06
Imported 2026-05-28
60 Claude 3 7 Sonnet 20250219 80.001% Claude 3.7 Sonnet
anthropic-claude-3.7-sonnet
Imported 2026-05-28
61 MiniMax M2.5 Lightning 79.963% Imported 2026-05-28
62 Command A 03 2025 79.697% Command A
cohere-command-a
Imported 2026-05-28
63 GLM 4.6 79.608% GLM 4.6
z-ai-glm-4.6
Imported 2026-05-28
64 Qwen 2.5 72B Instruct Turbo 79.403% Imported 2026-05-28
65 O4 Mini 2025-04-16 79.185% o4 Mini
openai-o4-mini
Imported 2026-05-28
66 Mistral Large 2512 79.138% Mistral: Mistral Large 3 2512
mistralai-mistral-large-2512
Imported 2026-05-28
67 Grok 4.1 Fast Non Reasoning 79.113% Grok 4.1 Fast
x-ai-grok-4.1-fast
Imported 2026-05-28
68 Gemini 2.5 Flash Lite Preview 09 2025 79.008% Gemini 2.5 Flash Lite Preview 09-2025
google-gemini-2.5-flash-lite-preview-09-2025
Imported 2026-05-28
69 Grok 4 Fast Non Reasoning 78.396% Grok 4 Fast
x-ai-grok-4-fast
Imported 2026-05-28
70 Gemini 2.0 Flash 001 78.364% Gemini 2.0 Flash
google-gemini-2.0-flash
Imported 2026-05-28
71 GPT 4.1 Mini 2025-04-14 78.044% GPT-4.1 Mini
openai-gpt-4.1-mini
Imported 2026-05-28
72 GPT 5.4 Nano 2026-03-17 77.92% GPT-5.4 Nano
openai-gpt-5.4-nano
Imported 2026-05-28
73 Llama4 Maverick Instruct Basic 77.812% Imported 2026-05-28
74 Grok 4.20 0309 Reasoning 77.738% Grok 4.20
x-ai-grok-4.20
Imported 2026-05-28
75 DeepSeek V3 0324 77.727% DeepSeek V3 0324
deepseek-deepseek-chat-v3-0324
Imported 2026-05-28
76 Llama 3.3 70B Instruct Turbo 77.18% Imported 2026-05-28
77 DeepSeek V3P2 Thinking 76.076% Imported 2026-05-28
78 GPT Oss 120B 75.938% gpt-oss-120b
openai-gpt-oss-120b
Imported 2026-05-28
79 GLM 4.5 75.627% GLM 4.5
z-ai-glm-4.5
Imported 2026-05-28
80 MiniMax M2.1 75.448% MiniMax M2.1
minimax-minimax-m2.1
Imported 2026-05-28
81 Jamba 1.5 Large 74.16% Imported 2026-05-28
82 Llama 4 Scout 17B 16E Instruct 72.036% Llama 4 Scout
meta-llama-llama-4-scout
Imported 2026-05-28
83 Jamba Large 1.6 71.962% Imported 2026-05-28
84 O3 Mini 2025-01-31 71.539% o3-mini
openai-o3-mini
Imported 2026-05-28
85 GPT Oss 20B 70.849% gpt-oss-20b
openai-gpt-oss-20b
Imported 2026-05-28
86 Claude 3 5 Haiku 20241022 70.331% Imported 2026-05-28
87 Gemini 1.0 Pro 002 70.247% Imported 2026-05-28
88 Jamba Mini 1.6 69.726% Imported 2026-05-28
89 Qwen 2.5 7B Instruct Turbo 69.559% Imported 2026-05-28
90 Mistral Small 2503 69.157% Imported 2026-05-28
91 Gemini 1.5 Pro 002 69.08% Imported 2026-05-28
92 Command R Plus 68.287% Imported 2026-05-28
93 Gemma 2 9B It 68.138% Imported 2026-05-28
94 Gemma 2 27B It 67.978% Gemma 2 27B
google-gemma-2-27b-it
Imported 2026-05-28
95 DeepSeek R1 67.323% R1
deepseek-r1
Imported 2026-05-28
96 Jamba 1.5 Mini 66.616% Imported 2026-05-28
97 DeepSeek V3P2 65.962% Imported 2026-05-28
98 GPT 3.5 Turbo 64.372% GPT-3.5 Turbo
openai-gpt-3.5-turbo
Imported 2026-05-28
99 Gemini 1.5 Flash 002 63.238% Imported 2026-05-28
100 Llama 2 70B Hf 62.434% Imported 2026-05-28
101 Mistral Medium 2505 61.979% Imported 2026-05-28
102 Command A Plus 05 2026 61.422% Imported 2026-05-28
103 GPT 4.1 Nano 2025-04-14 61.056% GPT-4.1 Nano
openai-gpt-4.1-nano
Imported 2026-05-28
104 Mixtral 8x7B V0.1 55.836% Imported 2026-05-28
105 Magistral Medium 2509 54.767% Imported 2026-05-28
106 Mistral 7B V0.1 53.77% Imported 2026-05-28
107 Llama 2 13B 53.696% Imported 2026-05-28
108 Llama 2 7B 51.869% Imported 2026-05-28
109 GPT 5 Nano 2025-08-07 50.129% GPT-5 Nano
openai-gpt-5-nano
Imported 2026-05-28
110 Magistral Small 2509 40.023% Imported 2026-05-28
111 Command R 32.965% Command R (08-2024)
cohere-command-r-08-2024
Imported 2026-05-28
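
The rows above follow a regular layout: a row with a model match spans three lines (rank, subject, score, and match name; then the match identifier; then provenance and sample date), while a row without a match fits on one line. The following is a minimal sketch of how that layout could be parsed into records, assuming the layout holds for every row. The names `Row`, `parse_rows`, `ROW_RE`, and `PROV_RE` are hypothetical and not part of any Vals.ai tooling.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Rank, subject, score with a trailing percent sign, then the rest of the line.
ROW_RE = re.compile(r"^(\d+) (.+?) (\d+(?:\.\d+)?)% (.*)$")
# Provenance word followed by an ISO date, e.g. "Imported 2026-05-28".
PROV_RE = re.compile(r"^(\w+) (\d{4}-\d{2}-\d{2})$")


@dataclass
class Row:
    rank: int
    subject: str
    score: float               # overall accuracy, in percent
    match_name: Optional[str]  # None when the row has no model match
    match_id: Optional[str]
    provenance: str
    sampled: str


def parse_rows(text: str) -> List[Row]:
    """Parse flattened leaderboard text into Row records (assumed layout)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    rows: List[Row] = []
    i = 0
    while i < len(lines):
        m = ROW_RE.match(lines[i])
        if m is None:
            i += 1  # skip headers and anything that is not a row start
            continue
        rank, subject, score, rest = m.groups()
        prov = PROV_RE.match(rest)
        if prov is not None:
            # Unmatched row: provenance and date sit on the same line.
            name = ident = None
            i += 1
        else:
            # Matched row: identifier and provenance follow on two more lines.
            if i + 2 >= len(lines):
                break
            name, ident = rest, lines[i + 1]
            prov = PROV_RE.match(lines[i + 2])
            if prov is None:
                i += 1  # layout assumption violated; skip this row
                continue
            i += 3
        rows.append(Row(int(rank), subject, float(score), name, ident,
                        prov.group(1), prov.group(2)))
    return rows


sample = """8 Claude Opus 4.6 Thinking 85.301% Claude Opus 4.6
anthropic-claude-opus-4.6
Imported 2026-05-28
10 Qwen 3.5 Plus Thinking 85.104% Imported 2026-05-28
"""
for row in parse_rows(sample):
    print(row.rank, row.subject, row.score, row.match_id)
```

The non-greedy subject group stops at the first number followed by `%`, so version numbers and dates inside a subject name are not mistaken for the score.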