Legal AI
Legal reasoning, contract, legal-agent, and jurisdiction-specific benchmark basket.
60 models
9 benchmarks
| # | Model | Family | Avg Rank | Coverage | LegalBench | CaseLaw v2 | PRBL | LEXam | J1-ENVS | HLA | IslamicLegalBench | RW | PatentBench |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Claude | 3.9 | 4/9 | #9 | #5 | — | — | — | #1 | — | #1 | — |
| 2 | GPT-5.5 | GPT | 4.0 | 4/9 | #4 | #7 | — | — | — | #4 | — | #2 | — |
| 3 | GPT-5 | GPT | 4.1 | 5/9 | #6 | #6 | #5 | #1 | — | — | #1 | — | — |
| 4 | GPT-5.1 | GPT | 4.3 | 3/9 | #7 | #2 | #3 | — | — | — | — | — | — |
| 5 | Gemini 3.1 Pro Preview | Gemini | 5.7 | 4/9 | #1 | #12 | #9 | — | — | — | — | #3 | — |
| 6 | Claude Opus 4.6 | Claude | 6.9 | 4/9 | #8 | #20 | #1 | — | — | #3 | — | — | — |
| 7 | Gemini 2.5 Pro | Gemini | 8.5 | 5/9 | — | #15 | #13 | #2 | — | — | #5 | — | #5 |
| 8 | Grok 4.3 | Grok | 8.8 | 2/9 | #14 | #1 | — | — | — | — | — | — | — |
| 9 | Gemini 2.5 Flash | Gemini | 9.0 | 2/9 | — | — | #13 | — | — | — | — | — | #3 |
| 10 | GPT-5.4 | GPT | 9.3 | 3/9 | #5 | #16 | #9 | — | — | — | — | — | — |
| 11 | Claude Opus 4.5 | Claude | 12.8 | 3/9 | #13 | #18 | #9 | — | — | — | — | — | — |
| 12 | Claude Opus 4.8 | Claude | 14.0 | 2/9 | #27 | — | — | — | — | #1 | — | — | — |
| 13 | Claude Sonnet 4.5 | Claude | 14.3 | 4/9 | #20 | #19 | #13 | — | — | — | #3 | — | — |
| 14 | Qwen3 32B | Qwen | 14.5 | 2/9 | — | — | — | #26 | #3 | — | — | — | — |
| 15 | Gemini 3 Flash Preview | Gemini | 14.6 | 2/9 | #3 | #32 | — | — | — | — | — | — | — |
| 16 | o3 | o-series | 15.0 | 2/9 | #25 | — | #5 | — | — | — | — | — | — |
| 17 | Gemini 3.5 Flash | Gemini | 15.5 | 2/9 | #26 | — | — | — | — | #5 | — | — | — |
| 18 | Gemini 3 | Gemini | 15.6 | 4/9 | #2 | #42 | #14 | #12 | — | — | — | — | — |
| 19 | MoonshotAI: Kimi K2.6 | Kimi | 16.0 | 2/9 | #12 | #22 | — | — | — | — | — | — | — |
| 20 | Gemma 3 12B | Gemma | 17.0 | 2/9 | — | — | — | #24 | #10 | — | — | — | — |
| 21 | Grok 4 | Grok | 17.1 | 3/9 | #30 | #9 | — | — | — | — | #6 | — | — |
| 22 | MoonshotAI: Kimi K2.5 | Kimi | 17.2 | 2/9 | — | #28 | #10 | — | — | — | — | — | — |
| 23 | GPT-4.1 | GPT | 18.3 | 4/9 | #32 | #3 | #23 | #6 | — | — | — | — | — |
| 24 | Claude Sonnet 4.6 | Claude | 20.4 | 3/9 | #43 | #14 | — | — | — | #2 | — | — | — |
| 25 | Claude Sonnet 4 | Claude | 21.6 | 2/9 | #34 | — | — | — | — | — | — | — | #3 |
| 26 | MiniMax M2.7 | MiniMax | 22.4 | 2/9 | #22 | #23 | — | — | — | — | — | — | — |
| 27 | Qwen3.5-Flash | Qwen | 22.6 | 2/9 | #17 | #31 | — | — | — | — | — | — | — |
| 28 | GPT-5 Mini | GPT | 23.1 | 3/9 | #48 | #4 | — | #5 | — | — | — | — | — |
| 29 | GPT-5.2 | GPT | 24.8 | 2/9 | #36 | #8 | — | — | — | — | — | — | — |
| 30 | Claude Opus 4.1 | Claude | 25.5 | 2/9 | #28 | — | #23 | — | — | — | — | — | — |
| 31 | GPT-4o (2024-11-20) | GPT | 25.7 | 3/9 | #42 | #26 | — | — | #1 | — | — | — | — |
| 32 | GLM 5.1 | GLM | 28.2 | 2/9 | #15 | #48 | — | — | — | — | — | — | — |
| 33 | Gemini 3.1 Flash Lite Preview | Gemini | 28.8 | 2/9 | #24 | #36 | — | — | — | — | — | — | — |
| 34 | MoonshotAI: Kimi K2 Thinking | Kimi | 29.4 | 3/9 | #57 | #11 | #14 | — | — | — | — | — | — |
| 35 | Qwen3.6 Plus | Qwen | 30.4 | 2/9 | #18 | #49 | — | — | — | — | — | — | — |
| 36 | GLM 5 | GLM | 30.6 | 2/9 | #21 | #45 | — | — | — | — | — | — | — |
| 37 | GLM 4.7 | GLM | 32.2 | 2/9 | #29 | #37 | — | — | — | — | — | — | — |
| 38 | Grok 4.1 Fast | Grok | 34.2 | 2/9 | #41 | #24 | — | — | — | — | — | — | — |
| 39 | Grok 4 Fast | Grok | 35.2 | 2/9 | #52 | #10 | — | — | — | — | — | — | — |
| 40 | gpt-oss-120b | GPT | 35.4 | 5/9 | #78 | #50 | #13 | #15 | — | — | #11 | — | — |
| 41 | MoonshotAI: Kimi K2 0711 | Kimi | 36.0 | 2/9 | #49 | — | #23 | — | — | — | — | — | — |
| 42 | DeepSeek V3 | DeepSeek | 36.2 | 2/9 | #51 | — | — | #14 | — | — | — | — | — |
| 43 | Claude 3.7 Sonnet | Claude | 37.2 | 2/9 | #60 | — | — | #3 | — | — | — | — | — |
| 44 | R1 | DeepSeek | 38.8 | 4/9 | #95 | — | — | #11 | #13 | — | #8 | — | — |
| 45 | GPT-4.1 Mini | GPT | 40.0 | 3/9 | #71 | — | #27 | #13 | — | — | — | — | — |
| 46 | o4 Mini | o-series | 41.5 | 2/9 | #65 | — | #18 | — | — | — | — | — | — |
| 47 | Claude Haiku 4.5 | Claude | 42.0 | 2/9 | #50 | #30 | — | — | — | — | — | — | — |
| 48 | Qwen3 235B A22B | Qwen | 42.0 | 2/9 | #58 | — | — | #18 | — | — | — | — | — |
| 49 | Command A | Command | 42.4 | 2/9 | #62 | #13 | — | — | — | — | — | — | — |
| 50 | DeepSeek V4 Pro | DeepSeek | 44.4 | 2/9 | #56 | #27 | — | — | — | — | — | — | — |
| 51 | DeepSeek V3 0324 | DeepSeek | 47.4 | 2/9 | #75 | — | — | — | #6 | — | — | — | — |
| 52 | Mistral: Mistral Large 3 2512 | Mistral | 48.0 | 2/9 | #66 | #21 | — | — | — | — | — | — | — |
| 53 | Qwen3 Max | Qwen | 49.0 | 2/9 | #47 | #52 | — | — | — | — | — | — | — |
| 54 | o3-mini | o-series | 56.8 | 2/9 | #84 | — | — | #16 | — | — | — | — | — |
| 55 | Grok 4.20 | Grok | 59.6 | 2/9 | #74 | #38 | — | — | — | — | — | — | — |
| 56 | gpt-oss-20b | GPT | 60.1 | 3/9 | #85 | #54 | — | #29 | — | — | — | — | — |
| 57 | MiniMax M2.1 | MiniMax | 61.2 | 2/9 | #80 | #33 | — | — | — | — | — | — | — |
| 58 | GPT-5.4 Nano | GPT | 61.6 | 2/9 | #72 | #46 | — | — | — | — | — | — | — |
| 59 | GPT-5 Nano | GPT | 68.1 | 3/9 | #109 | #44 | — | #31 | — | — | — | — | — |
| 60 | GPT-4.1 Nano | GPT | 69.8 | 2/9 | #103 | — | — | #20 | — | — | — | — | — |
| Group | Weight | Benchmark | Rows |
|---|---|---|---|
| Legal core | 1.5x | Professional Reasoning Bench - Legal | 24 |
| Legal core | 1.5x | LegalBench | 69 |
| Legal core | 1.5x | Harvey Legal Agent Benchmark | 6 |
| Legal core | 1.5x | Realm Warren | 3 |
| Legal breadth | 1x | IslamicLegalBench | 7 |
| Legal breadth | 1x | PatentBench | 3 |
| Legal breadth | 1x | J1-ENVS | 11 |
| Legal breadth | 1x | CaseLaw v2 | 48 |
| Legal breadth | 1x | LEXam | 22 |
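The page does not state how Avg Rank is derived. The published values do appear consistent with a weighted mean of a model's per-benchmark ranks, using the group weights above and skipping benchmarks where the model has no row. The sketch below illustrates that inferred calculation. The function name `weighted_avg_rank` is hypothetical, and mapping the column abbreviations PRBL, HLA, and RW to Professional Reasoning Bench - Legal, Harvey Legal Agent Benchmark, and Realm Warren is an assumption.

```python
# Weights taken from the benchmark table: 1.5x for "Legal core", 1x for "Legal breadth".
# The abbreviation-to-benchmark mapping in the comments is an assumption.
WEIGHTS = {
    "LegalBench": 1.5,
    "PRBL": 1.5,  # Professional Reasoning Bench - Legal (assumed)
    "HLA": 1.5,   # Harvey Legal Agent Benchmark (assumed)
    "RW": 1.5,    # Realm Warren (assumed)
    "CaseLaw v2": 1.0,
    "LEXam": 1.0,
    "J1-ENVS": 1.0,
    "IslamicLegalBench": 1.0,
    "PatentBench": 1.0,
}


def weighted_avg_rank(ranks):
    """Weighted mean of the ranks a model holds; benchmarks without a row are skipped.

    `ranks` maps a benchmark name to the model's rank on it. This is an inferred
    formula, not one documented by the leaderboard.
    """
    if not ranks:
        raise ValueError("at least one benchmark rank is required")
    total_weight = sum(WEIGHTS[name] for name in ranks)
    weighted_sum = sum(WEIGHTS[name] * rank for name, rank in ranks.items())
    return weighted_sum / total_weight


# Claude Opus 4.7 as listed: LegalBench #9, CaseLaw v2 #5, HLA #1, RW #1.
opus_47 = {"LegalBench": 9, "CaseLaw v2": 5, "HLA": 1, "RW": 1}
print(f"{weighted_avg_rank(opus_47):.1f}")  # 3.9, matching the table
```

Under this reading, a model's Avg Rank depends heavily on which benchmarks it appears in, so rows with 2/9 coverage are not directly comparable to rows with 5/9.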