LiveCodeBench

Our implementation of the LiveCodeBench benchmark

Rows: 113
Primary metric: score
Sampled: 2026-05-28

Metadata

Metrics

Score (higher is better), Std. error (lower is better), Latency (lower is better), Cost per test (lower is better)

Latest Results

Full leaderboard rows decoded from the Vals.ai benchmark detail page. The primary score is the overall accuracy percentage.

Rank Subject Score Model Match Provenance Sampled
1 Gemini 3.1 Pro Preview 88.485% Gemini 3.1 Pro Preview
google-gemini-3.1-pro-preview
Imported 2026-05-28
2 GPT 5.2 Codex 87.993% GPT-5.2-Codex
openai-gpt-5.2-codex
Imported 2026-05-28
3 Claude Opus 4.8 87.819% Claude Opus 4.8
anthropic-claude-opus-4.8
Imported 2026-05-28
4 Gemini 3.5 Flash 87.604% Gemini 3.5 Flash
google-gemini-3.5-flash
Imported 2026-05-28
5 DeepSeek V4 Pro 87.484% DeepSeek V4 Pro
deepseek-deepseek-v4-pro
Imported 2026-05-28
6 GPT 5.3 Codex 87.313% GPT-5.3-Codex
openai-gpt-5.3-codex
Imported 2026-05-28
7 Qwen 3.7 Max 87.057% Qwen3.7 Max
qwen-qwen3.7-max
Imported 2026-05-28
8 Kimi K2.6 Thinking 86.771% MoonshotAI: Kimi K2.6
moonshotai-kimi-k2.6
Imported 2026-05-28
9 GPT 5 Mini 2025-08-07 86.605% GPT-5 Mini
openai-gpt-5-mini
Imported 2026-05-28
10 GPT 5.1 2025-11-13 86.486% GPT-5.1
openai-gpt-5.1
Imported 2026-05-28
11 Gemini 3 Pro Preview 86.407% Gemini 3
google-gemini-3
Imported 2026-05-28
12 Qwen 3.6 Plus 85.952% Qwen3.6 Plus
qwen-qwen3.6-plus
Imported 2026-05-28
13 GPT 5 2025-08-07 85.911% GPT-5
openai-gpt-5
Imported 2026-05-28
14 Gemini 3 Flash Preview 85.591% Gemini 3 Flash Preview
google-gemini-3-flash-preview
Imported 2026-05-28
15 GPT 5.1 Codex 85.55% GPT-5.1-Codex
openai-gpt-5.1-codex
Imported 2026-05-28
16 GPT 5.2 2025-12-11 85.361% GPT-5.2
openai-gpt-5.2
Imported 2026-05-28
17 Qwen 3.5 Plus Thinking 85.326% Imported 2026-05-28
18 GPT 5.5 85.296% GPT-5.5
openai-gpt-5.5
Imported 2026-05-28
19 Claude Opus 4.7 85.073% Claude Opus 4.7
anthropic-claude-opus-4.7
Imported 2026-05-28
20 GPT 5 Codex 84.725% GPT-5 Codex
openai-gpt-5-codex
Imported 2026-05-28
21 Claude Opus 4.6 Thinking 84.676% Claude Opus 4.6
anthropic-claude-opus-4.6
Imported 2026-05-28
22 Grok 4.3 84.494% Grok 4.3
x-ai-grok-4.3
Imported 2026-05-28
23 Grok 4.20 0309 Reasoning 84.265% Grok 4.20
x-ai-grok-4.20
Imported 2026-05-28
24 GPT 5.4 2026-03-05 84.141% GPT-5.4
openai-gpt-5.4
Imported 2026-05-28
25 GPT 5.4 Nano 2026-03-17 84.009% GPT-5.4 Nano
openai-gpt-5.4-nano
Imported 2026-05-28
26 O3 2025-04-16 83.914% o3
openai-o3
Imported 2026-05-28
27 Kimi K2.5 Thinking 83.868% MoonshotAI: Kimi K2.5
moonshotai-kimi-k2.5
Imported 2026-05-28
28 Claude Opus 4.5 20251101 Thinking 83.67% Claude Opus 4.5
anthropic-claude-opus-4.5
Imported 2026-05-28
29 GPT 5.1 Codex Max 83.558% GPT-5.1-Codex-Max
openai-gpt-5.1-codex-max
Imported 2026-05-28
30 Qwen 3.5 Flash 83.28% Qwen3.5-Flash
qwen-qwen3.5-flash-02-23
Imported 2026-05-28
31 Grok 4 0709 83.247% Grok 4
x-ai-grok-4
Imported 2026-05-28
32 GPT Oss 120B 83.234% gpt-oss-120b
openai-gpt-oss-120b
Imported 2026-05-28
33 GLM 4.7 82.234% GLM 4.7
z-ai-glm-4.7
Imported 2026-05-28
34 O4 Mini 2025-04-16 82.208% o4 Mini
openai-o4-mini
Imported 2026-05-28
35 Claude Sonnet 4.6 82.091% Claude Sonnet 4.6
anthropic-claude-sonnet-4.6
Imported 2026-05-28
36 GLM 5 Thinking 81.868% GLM 5
z-ai-glm-5
Imported 2026-05-28
37 MiniMax M2.1 81.756% MiniMax M2.1
minimax-minimax-m2.1
Imported 2026-05-28
38 GPT 5.4 Mini 2026-03-17 81.465% GPT-5.4 Mini
openai-gpt-5.4-mini
Imported 2026-05-28
39 GLM 5.1 Thinking 81.38% GLM 5.1
z-ai-glm-5.1
Imported 2026-05-28
40 GLM 4.6 81.036% GLM 4.6
z-ai-glm-4.6
Imported 2026-05-28
41 DeepSeek V3P2 Thinking 80.695% Imported 2026-05-28
42 Grok 4.1 Fast Reasoning 80.641% Grok 4.1 Fast
x-ai-grok-4.1-fast
Imported 2026-05-28
43 GPT Oss 20B 80.387% gpt-oss-20b
openai-gpt-oss-20b
Imported 2026-05-28
44 Gemini 3.1 Flash Lite Preview 80.116% Gemini 3.1 Flash Lite Preview
google-gemini-3.1-flash-lite-preview
Imported 2026-05-28
45 MiniMax M2.7 79.926% MiniMax M2.7
minimax-minimax-m2.7
Imported 2026-05-28
46 Command A Plus 05 2026 79.382% Imported 2026-05-28
47 MiniMax M2.5 Lightning 79.208% Imported 2026-05-28
48 Gemini 2.5 Pro Preview 03 25 79.164% Gemini 2.5 Pro Preview 05-06
google-gemini-2.5-pro-preview-05-06
Imported 2026-05-28
49 Grok 4 Fast Reasoning 78.973% Grok 4 Fast
x-ai-grok-4-fast
Imported 2026-05-28
50 Qwen 3 Max 78.215% Qwen3 Max
qwen-qwen3-max
Imported 2026-05-28
51 Grok 3 Mini Fast High Reasoning 76.22% Imported 2026-05-28
52 Gemini 2.5 Flash Preview 09 2025 Thinking 76.214% Imported 2026-05-28
53 Gemini 2.5 Flash Preview 09 2025 75.063% Imported 2026-05-28
54 Claude Opus 4.5 20251101 75.034% Claude Opus 4.5
anthropic-claude-opus-4.5
Imported 2026-05-28
55 Magistral Medium 2509 74.86% Imported 2026-05-28
56 Claude Sonnet 4.5 20250929 Thinking 72.996% Claude Sonnet 4.5
anthropic-claude-sonnet-4.5
Imported 2026-05-28
57 Magistral Small 2509 72.131% Imported 2026-05-28
58 O3 Mini 2025-01-31 71.484% o3-mini
openai-o3-mini
Imported 2026-05-28
59 Gemini 2.5 Flash Lite Preview 09 2025 Thinking 71.385% Imported 2026-05-28
60 Qwen 3 235B A22b 70.62% Qwen3 235B A22B
qwen-qwen3-235b-a22b
Imported 2026-05-28
61 Kimi K2 Instruct 70.449% MoonshotAI: Kimi K2 0711
moonshotai-kimi-k2
Imported 2026-05-28
62 DeepSeek R1 70.221% R1
deepseek-r1
Imported 2026-05-28
63 GPT 5 Nano 2025-08-07 70.216% GPT-5 Nano
openai-gpt-5-nano
Imported 2026-05-28
64 Claude Opus 4 20250514 Thinking 70.188% Imported 2026-05-28
65 DeepSeek V3P2 69.856% Imported 2026-05-28
66 Gemini 2.5 Flash Lite Preview 09 2025 67.669% Gemini 2.5 Flash Lite Preview 09-2025
google-gemini-2.5-flash-lite-preview-09-2025
Imported 2026-05-28
67 GLM 4.5 67.446% GLM 4.5
z-ai-glm-4.5
Imported 2026-05-28
68 Qwen 3 Max Preview 66.91% Imported 2026-05-28
69 Claude Opus 4.1 20250805 Thinking 66.456% Claude Opus 4.1
anthropic-claude-opus-4.1
Imported 2026-05-28
70 Grok 3 Mini Fast Low Reasoning 66.265% Imported 2026-05-28
71 DeepSeek V3 0324 65.478% DeepSeek V3 0324
deepseek-deepseek-chat-v3-0324
Imported 2026-05-28
72 Claude Opus 4.1 20250805 64.559% Claude Opus 4.1
anthropic-claude-opus-4.1
Imported 2026-05-28
73 Kimi K2 Thinking 63.145% MoonshotAI: Kimi K2 Thinking
moonshotai-kimi-k2-thinking
Imported 2026-05-28
74 Claude Opus 4 20250514 62.629% Claude Opus 4
anthropic-claude-opus-4
Imported 2026-05-28
75 Claude Sonnet 4 20250514 Thinking 62.392% Imported 2026-05-28
76 Grok Code Fast 1 61.969% Grok Code Fast 1
x-ai-grok-code-fast-1
Imported 2026-05-28
77 Claude 3.7 Sonnet 20250219 Thinking 60.436% Imported 2026-05-28
78 Claude Sonnet 4 20250514 59.673% Claude Sonnet 4
anthropic-claude-sonnet-4
Imported 2026-05-28
79 Llama 3.3 Nemotron Super 49B V1 42e84561 Thinking 58.369% Imported 2026-05-28
80 GPT 4.1 Mini 2025-04-14 58.158% GPT-4.1 Mini
openai-gpt-4.1-mini
Imported 2026-05-28
81 Gemini 2.5 Flash Preview 04 17 56.936% Imported 2026-05-28
82 Claude 3.7 Sonnet 20250219 56.662% Claude 3.7 Sonnet
anthropic-claude-3.7-sonnet
Imported 2026-05-28
83 Mistral Large 2512 55.337% Mistral: Mistral Large 3 2512
mistralai-mistral-large-2512
Imported 2026-05-28
84 GPT 4.1 2025-04-14 54.666% GPT-4.1
openai-gpt-4.1
Imported 2026-05-28
85 Grok 3 52.901% Grok 3
xaigrok-3
Imported 2026-05-28
86 Devstral 2512 51.841% Mistral: Devstral 2 2512
mistralai-devstral-2512
Imported 2026-05-28
87 O1 2024-12-17 50.264% o1
openai-o1
Imported 2026-05-28
88 Claude 3.5 Sonnet 20241022 49.628% Claude 3.5 Sonnet
anthropic-claude-3.5-sonnet
Imported 2026-05-28
89 Llama4 Maverick Instruct Basic 47.251% Imported 2026-05-28
90 Gemini 2.5 Flash Preview 04 17 Thinking 46.871% Imported 2026-05-28
91 Grok 4 Fast Non Reasoning 46.095% Grok 4 Fast
x-ai-grok-4-fast
Imported 2026-05-28
92 Mistral Medium 2505 44.845% Imported 2026-05-28
93 Gemini 2.0 Flash 001 43.608% Gemini 2.0 Flash
google-gemini-2.0-flash
Imported 2026-05-28
94 GPT 4O 2024-11-20 43.444% GPT-4o (2024-11-20)
openai-gpt-4o-2024-11-20
Imported 2026-05-28
95 Labs Devstral Small 2512 43.178% Imported 2026-05-28
96 GPT 4.1 Nano 2025-04-14 42.718% GPT-4.1 Nano
openai-gpt-4.1-nano
Imported 2026-05-28
97 Grok 4.1 Fast Non Reasoning 42.622% Grok 4.1 Fast
x-ai-grok-4.1-fast
Imported 2026-05-28
98 Claude 3.5 Haiku 20241022 41.918% Imported 2026-05-28
99 Gemini 1.5 Pro 002 41.719% Imported 2026-05-28
100 Claude Haiku 4.5 20251001 Thinking 41.175% Claude Haiku 4.5
anthropic-claude-haiku-4.5
Imported 2026-05-28
101 Grok 2 1212 38.679% Imported 2026-05-28
102 Llama 4 Scout 17B 16E Instruct 38.541% Llama 4 Scout
meta-llama-llama-4-scout
Imported 2026-05-28
103 Mistral Large 2411 37.088% Mistral Large 2411
mistralai-mistral-large-2411
Imported 2026-05-28
104 Gemini 1.5 Flash 002 36.91% Imported 2026-05-28
105 Llama 3.3 70B Instruct Turbo 36.341% Imported 2026-05-28
106 Llama 3.3 Nemotron Super 49B V1 42e84561 36.308% Imported 2026-05-28
107 Command A 03 2025 35.071% Command A
cohere-command-a
Imported 2026-05-28
108 Mistral Small 2503 31.815% Imported 2026-05-28
109 GPT 4O Mini 2024-07-18 26.423% GPT-4o-mini (2024-07-18)
openai-gpt-4o-mini-2024-07-18
Imported 2026-05-28
110 Jamba Large 1.6 22.325% Imported 2026-05-28
111 Command R Plus 18.238% Imported 2026-05-28
112 Mistral Small 2402 15.781% Imported 2026-05-28
113 Jamba Mini 1.6 9.918% Imported 2026-05-28