SecCodeBench

Security benchmark for AI-generated and AI-repaired code, reporting secure-code repair and generation scores with and without hints.

28 rows
Primary metric: total_score
Sampled: 2026-05-28

Metadata

Metrics

Total Score, Repair Score, Repair With Hints Score, Generation Score, Generation With Hints Score

Latest Results

Rows are imported from the official SecCodeBench 2.2.0 public leaderboard JSON. The site exposes 98 public test-case metadata records for this version.
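
As a rough illustration of how imported rows could be ordered into a ranking by the primary metric, here is a minimal sketch. The JSON layout and the field names `subject` and `total_score` are assumptions made for this example, not the documented schema of the SecCodeBench leaderboard file; the three sample scores are copied from this page.

```python
import json

# Hypothetical payload: the real leaderboard JSON may use a different
# structure and different field names. Scores are taken from this page.
RAW = """
[
  {"subject": "Claude Opus 4.6", "total_score": 64.9},
  {"subject": "Qwen3.5-Plus(Thinking)", "total_score": 67.08},
  {"subject": "Qwen3-Coder-Next", "total_score": 62.76}
]
"""


def rank_rows(raw_json):
    """Sort records by total_score (highest first) and attach a 1-based rank."""
    records = json.loads(raw_json)
    ordered = sorted(records, key=lambda r: r["total_score"], reverse=True)
    return [
        {"rank": i, "subject": r["subject"], "total_score": r["total_score"]}
        for i, r in enumerate(ordered, start=1)
    ]


rows = rank_rows(RAW)
for row in rows:
    print(f'{row["rank"]} {row["subject"]} {row["total_score"]:.2f}%')
```

Ties are not handled specially here; Python's stable sort leaves rows with equal scores in their input order.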

| Rank | Subject | Total Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Qwen3.5-Plus(Thinking) | 67.08% | Qwen3.5 Plus 2026-04-20 (qwen-qwen3.5-plus-20260420) | Imported | 2026-05-28 |
| 2 | Claude Opus 4.6 | 64.9% | Claude Opus 4.6 (anthropic-claude-opus-4.6) | Imported | 2026-05-28 |
| 3 | Qwen3-Coder-Next | 62.76% | Qwen3 Coder Next (qwen-qwen3-coder-next) | Imported | 2026-05-28 |
| 4 | Gemini 3 Pro | 62.42% | Gemini 3 (google-gemini-3) | Imported | 2026-05-28 |
| 5 | GLM-5(Thinking) | 62.13% | GLM GLM 5 (z-ai-glm-5) | Imported | 2026-05-28 |
| 6 | Doubao-Seed-2.0-pro(Thinking:high) | 61.98% | | Imported | 2026-05-28 |
| 7 | Kimi K2.5(Thinking) | 61.25% | KIMI MoonshotAI: Kimi K2.5 (moonshotai-kimi-k2.5) | Imported | 2026-05-28 |
| 8 | GPT-5.4 | 59.74% | GPT-5.4 (openai-gpt-5.4) | Imported | 2026-05-28 |
| 9 | Doubao-Seed-2.0-Code(Thinking:high) | 59.38% | | Imported | 2026-05-28 |
| 10 | Gemini 3 Flash | 58.66% | Gemini 3 Flash Preview (google-gemini-3-flash-preview) | Imported | 2026-05-28 |
| 11 | GPT-5.2 | 58.23% | GPT-5.2 (openai-gpt-5.2) | Imported | 2026-05-28 |
| 12 | Qwen3-Max-2026-01-23(Thinking) | 57.45% | | Imported | 2026-05-28 |
| 13 | Claude Sonnet 4.5 | 56.83% | Claude Sonnet 4.5 (anthropic-claude-sonnet-4.5) | Imported | 2026-05-28 |
| 14 | Qwen3-Coder-Plus-2025-09-23 | 56.44% | | Imported | 2026-05-28 |
| 15 | DeepSeek-V3.2(Thinking) | 55.24% | DeepSeek V3.2 (deepseek-deepseek-v3.2) | Imported | 2026-05-28 |
| 16 | Kimi K2.5(Non-thinking) | 55.22% | KIMI MoonshotAI: Kimi K2.5 (moonshotai-kimi-k2.5) | Imported | 2026-05-28 |
| 17 | Gemini 3.1 Pro | 55.21% | Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview) | Imported | 2026-05-28 |
| 18 | DeepSeek-R1-0528 | 54.06% | R1 0528 (deepseek-deepseek-r1-0528) | Imported | 2026-05-28 |
| 19 | Doubao-Seed1.8 | 53.47% | | Imported | 2026-05-28 |
| 20 | Claude Haiku 4.5 | 52.45% | Claude Haiku 4.5 (anthropic-claude-haiku-4.5) | Imported | 2026-05-28 |
| 21 | GLM-4.7(Thinking) | 52.29% | GLM GLM 4.7 (z-ai-glm-4.7) | Imported | 2026-05-28 |
| 22 | DeepSeek-V3.2(Non-thinking) | 51.81% | DeepSeek V3.2 (deepseek-deepseek-v3.2) | Imported | 2026-05-28 |
| 23 | Qwen3-Max | 51.23% | Qwen3 Max (qwen-qwen3-max) | Imported | 2026-05-28 |
| 24 | Doubao-Seed-Code | 49.23% | | Imported | 2026-05-28 |
| 25 | MiniMax M2.5 | 46.31% | MiniMax M2.5 (minimax-minimax-m2.5) | Imported | 2026-05-28 |
| 26 | GLM-4.7(Non-thinking) | 44.64% | GLM GLM 4.7 (z-ai-glm-4.7) | Imported | 2026-05-28 |
| 27 | MiMo-V2-Flash | 40.59% | MiMo-V2-Flash (xiaomi-mimo-v2-flash) | Imported | 2026-05-28 |
| 28 | MiniMax M2.1 | 32.11% | MiniMax M2.1 (minimax-minimax-m2.1) | Imported | 2026-05-28 |