SecCodeBench
Security benchmark for AI-generated and AI-repaired code, reporting secure-code repair and generation scores with and without hints.
28 rows
Primary metric: total_score
Sampled: 2026-05-28
Metadata
Metrics
Total Score, Repair Score, Repair With Hints Score, Generation Score, Generation With Hints Score
| Rank | Subject | Total Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Qwen3.5-Plus(Thinking) | 67.08% | Qwen3.5 Plus 2026-04-20 `qwen-qwen3.5-plus-20260420` | Imported | 2026-05-28 |
| 2 | Claude Opus 4.6 | 64.9% | Claude Opus 4.6 `anthropic-claude-opus-4.6` | Imported | 2026-05-28 |
| 3 | Qwen3-Coder-Next | 62.76% | Qwen3 Coder Next `qwen-qwen3-coder-next` | Imported | 2026-05-28 |
| 4 | Gemini 3 Pro | 62.42% | Gemini 3 `google-gemini-3` | Imported | 2026-05-28 |
| 5 | GLM-5(Thinking) | 62.13% | GLM 5 `z-ai-glm-5` | Imported | 2026-05-28 |
| 6 | Doubao-Seed-2.0-pro(Thinking:high) | 61.98% | — | Imported | 2026-05-28 |
| 7 | Kimi K2.5(Thinking) | 61.25% | MoonshotAI: Kimi K2.5 `moonshotai-kimi-k2.5` | Imported | 2026-05-28 |
| 8 | GPT-5.4 | 59.74% | GPT-5.4 `openai-gpt-5.4` | Imported | 2026-05-28 |
| 9 | Doubao-Seed-2.0-Code(Thinking:high) | 59.38% | — | Imported | 2026-05-28 |
| 10 | Gemini 3 Flash | 58.66% | Gemini 3 Flash Preview `google-gemini-3-flash-preview` | Imported | 2026-05-28 |
| 11 | GPT-5.2 | 58.23% | GPT-5.2 `openai-gpt-5.2` | Imported | 2026-05-28 |
| 12 | Qwen3-Max-2026-01-23(Thinking) | 57.45% | — | Imported | 2026-05-28 |
| 13 | Claude Sonnet 4.5 | 56.83% | Claude Sonnet 4.5 `anthropic-claude-sonnet-4.5` | Imported | 2026-05-28 |
| 14 | Qwen3-Coder-Plus-2025-09-23 | 56.44% | — | Imported | 2026-05-28 |
| 15 | DeepSeek-V3.2(Thinking) | 55.24% | DeepSeek V3.2 `deepseek-deepseek-v3.2` | Imported | 2026-05-28 |
| 16 | Kimi K2.5(Non-thinking) | 55.22% | MoonshotAI: Kimi K2.5 `moonshotai-kimi-k2.5` | Imported | 2026-05-28 |
| 17 | Gemini 3.1 Pro | 55.21% | Gemini 3.1 Pro Preview `google-gemini-3.1-pro-preview` | Imported | 2026-05-28 |
| 18 | DeepSeek-R1-0528 | 54.06% | R1 0528 `deepseek-deepseek-r1-0528` | Imported | 2026-05-28 |
| 19 | Doubao-Seed1.8 | 53.47% | — | Imported | 2026-05-28 |
| 20 | Claude Haiku 4.5 | 52.45% | Claude Haiku 4.5 `anthropic-claude-haiku-4.5` | Imported | 2026-05-28 |
| 21 | GLM-4.7(Thinking) | 52.29% | GLM 4.7 `z-ai-glm-4.7` | Imported | 2026-05-28 |
| 22 | DeepSeek-V3.2(Non-thinking) | 51.81% | DeepSeek V3.2 `deepseek-deepseek-v3.2` | Imported | 2026-05-28 |
| 23 | Qwen3-Max | 51.23% | Qwen3 Max `qwen-qwen3-max` | Imported | 2026-05-28 |
| 24 | Doubao-Seed-Code | 49.23% | — | Imported | 2026-05-28 |
| 25 | MiniMax M2.5 | 46.31% | MiniMax M2.5 `minimax-minimax-m2.5` | Imported | 2026-05-28 |
| 26 | GLM-4.7(Non-thinking) | 44.64% | GLM 4.7 `z-ai-glm-4.7` | Imported | 2026-05-28 |
| 27 | MiMo-V2-Flash | 40.59% | MiMo-V2-Flash `xiaomi-mimo-v2-flash` | Imported | 2026-05-28 |
| 28 | MiniMax M2.1 | 32.11% | MiniMax M2.1 `minimax-minimax-m2.1` | Imported | 2026-05-28 |