Legal AI

A benchmark basket covering legal reasoning, contracts, legal agents, and jurisdiction-specific law.

60 models
9 benchmarks

Ranked Models

Domain
# Model Avg Rank LegalBench CaseLaw v2 PRBL LEXam J1-ENVS HLA IslamicLegalBench RW PatentBench
1 Claude Opus 4.7 Claude 3.9
4/9 rows
#9 #5 #1 #1
2 GPT-5.5 GPT 4.0
4/9 rows
#4 #7 #4 #2
3 GPT-5 GPT 4.1
5/9 rows
#6 #6 #5 #1 #1
4 GPT-5.1 GPT 4.3
3/9 rows
#7 #2 #3
5 Gemini 3.1 Pro Preview Gemini 5.7
4/9 rows
#1 #12 #9 #3
6 Claude Opus 4.6 Claude 6.9
4/9 rows
#8 #20 #1 #3
7 Gemini 2.5 Pro Gemini 8.5
5/9 rows
#15 #13 #2 #5 #5
8 Grok 4.3 Grok 8.8
2/9 rows
#14 #1
9 Gemini 2.5 Flash Gemini 9.0
2/9 rows
#13 #3
10 GPT-5.4 GPT 9.3
3/9 rows
#5 #16 #9
11 Claude Opus 4.5 Claude 12.8
3/9 rows
#13 #18 #9
12 Claude Opus 4.8 Claude 14.0
2/9 rows
#27 #1
13 Claude Sonnet 4.5 Claude 14.3
4/9 rows
#20 #19 #13 #3
14 Qwen3 32B Qwen 14.5
2/9 rows
#26 #3
15 Gemini 3 Flash Preview Gemini 14.6
2/9 rows
#3 #32
16 o3 o-series 15.0
2/9 rows
#25 #5
17 Gemini 3.5 Flash Gemini 15.5
2/9 rows
#26 #5
18 Gemini 3 Gemini 15.6
4/9 rows
#2 #42 #14 #12
19 MoonshotAI: Kimi K2.6 Kimi 16.0
2/9 rows
#12 #22
20 Gemma 3 12B Gemma 17.0
2/9 rows
#24 #10
21 Grok 4 Grok 17.1
3/9 rows
#30 #9 #6
22 MoonshotAI: Kimi K2.5 Kimi 17.2
2/9 rows
#28 #10
23 GPT-4.1 GPT 18.3
4/9 rows
#32 #3 #23 #6
24 Claude Sonnet 4.6 Claude 20.4
3/9 rows
#43 #14 #2
25 Claude Sonnet 4 Claude 21.6
2/9 rows
#34 #3
26 MiniMax M2.7 MiniMax 22.4
2/9 rows
#22 #23
27 Qwen3.5-Flash Qwen 22.6
2/9 rows
#17 #31
28 GPT-5 Mini GPT 23.1
3/9 rows
#48 #4 #5
29 GPT-5.2 GPT 24.8
2/9 rows
#36 #8
30 Claude Opus 4.1 Claude 25.5
2/9 rows
#28 #23
31 GPT-4o (2024-11-20) GPT 25.7
3/9 rows
#42 #26 #1
32 GLM 5.1 GLM 28.2
2/9 rows
#15 #48
33 Gemini 3.1 Flash Lite Preview Gemini 28.8
2/9 rows
#24 #36
34 MoonshotAI: Kimi K2 Thinking Kimi 29.4
3/9 rows
#57 #11 #14
35 Qwen3.6 Plus Qwen 30.4
2/9 rows
#18 #49
36 GLM 5 GLM 30.6
2/9 rows
#21 #45
37 GLM 4.7 GLM 32.2
2/9 rows
#29 #37
38 Grok 4.1 Fast Grok 34.2
2/9 rows
#41 #24
39 Grok 4 Fast Grok 35.2
2/9 rows
#52 #10
40 gpt-oss-120b GPT 35.4
5/9 rows
#78 #50 #13 #15 #11
41 MoonshotAI: Kimi K2 0711 Kimi 36.0
2/9 rows
#49 #23
42 DeepSeek V3 DeepSeek 36.2
2/9 rows
#51 #14
43 Claude 3.7 Sonnet Claude 37.2
2/9 rows
#60 #3
44 R1 DeepSeek 38.8
4/9 rows
#95 #11 #13 #8
45 GPT-4.1 Mini GPT 40.0
3/9 rows
#71 #27 #13
46 o4 Mini o-series 41.5
2/9 rows
#65 #18
47 Claude Haiku 4.5 Claude 42.0
2/9 rows
#50 #30
48 Qwen3 235B A22B Qwen 42.0
2/9 rows
#58 #18
49 Command A Command 42.4
2/9 rows
#62 #13
50 DeepSeek V4 Pro DeepSeek 44.4
2/9 rows
#56 #27
51 DeepSeek V3 0324 DeepSeek 47.4
2/9 rows
#75 #6
52 Mistral: Mistral Large 3 2512 Mistral 48.0
2/9 rows
#66 #21
53 Qwen3 Max Qwen 49.0
2/9 rows
#47 #52
54 o3-mini o-series 56.8
2/9 rows
#84 #16
55 Grok 4.20 Grok 59.6
2/9 rows
#74 #38
56 gpt-oss-20b GPT 60.1
3/9 rows
#85 #54 #29
57 MiniMax M2.1 MiniMax 61.2
2/9 rows
#80 #33
58 GPT-5.4 Nano GPT 61.6
2/9 rows
#72 #46
59 GPT-5 Nano GPT 68.1
3/9 rows
#109 #44 #31
60 GPT-4.1 Nano GPT 69.8
2/9 rows
#103 #20

Benchmark Groups

Group Weight Benchmark Rows
Legal core 1.5x Professional Reasoning Bench - Legal 24
Legal core 1.5x LegalBench 69
Legal core 1.5x Harvey Legal Agent Benchmark 6
Legal core 1.5x Realm Warren 3
Legal breadth 1x IslamicLegalBench 7
Legal breadth 1x PatentBench 3
Legal breadth 1x J1-ENVS 11
Legal breadth 1x CaseLaw v2 48
Legal breadth 1x LEXam 22
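
The page does not state how Avg Rank is derived from the per-benchmark ranks. One plausible reading, sketched below, is a weighted mean of the ranks a model actually has, using the group weights from the table above (1.5x for Legal core, 1x for Legal breadth) and skipping benchmarks where the model has no row. The function name and the assignment of ranks to benchmarks are assumptions: the flattened ranking table does not show which column each rank belongs to. With the guessed assignment used here, the sketch reproduces the 3.9 listed for Claude Opus 4.7, which is suggestive but not confirmation of the method.

```python
# Group weights copied from the Benchmark Groups table.
WEIGHTS = {
    "Professional Reasoning Bench - Legal": 1.5,  # Legal core
    "LegalBench": 1.5,                            # Legal core
    "Harvey Legal Agent Benchmark": 1.5,          # Legal core
    "Realm Warren": 1.5,                          # Legal core
    "IslamicLegalBench": 1.0,                     # Legal breadth
    "PatentBench": 1.0,                           # Legal breadth
    "J1-ENVS": 1.0,                               # Legal breadth
    "CaseLaw v2": 1.0,                            # Legal breadth
    "LEXam": 1.0,                                 # Legal breadth
}


def weighted_avg_rank(ranks):
    """Weighted mean of per-benchmark ranks (hypothetical helper).

    `ranks` maps benchmark name -> rank; benchmarks the model was not
    evaluated on are simply absent and do not count toward the mean.
    """
    if not ranks:
        raise ValueError("model has no benchmark rows")
    total_weight = sum(WEIGHTS[name] for name in ranks)
    weighted_sum = sum(WEIGHTS[name] * rank for name, rank in ranks.items())
    return weighted_sum / total_weight


# ASSUMPTION: which benchmark each of the four ranks (#9 #5 #1 #1)
# belongs to is a guess; the source table lost its column alignment.
opus_47 = {
    "LegalBench": 9,
    "CaseLaw v2": 5,
    "Professional Reasoning Bench - Legal": 1,
    "Harvey Legal Agent Benchmark": 1,
}

print(f"{weighted_avg_rank(opus_47):.1f}")  # prints 3.9
```

Because missing benchmarks are skipped rather than penalized, a model with only two rows can place above a model with five, so under this reading the "N/9 rows" coverage figure is worth checking alongside the average.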