BLXBench

Community benchmark runner and public leaderboard for AI model performance across coding, debugging, reasoning, hallucination, refactoring, security, and speed slices.

Rows: 25
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Passed Tests, Total Tests, Pass Rate, Average Latency (lower is better), Decode Throughput, Slice Cost (lower is better)
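
The page lists Pass Rate alongside Passed Tests and Total Tests but does not define how it is computed. A minimal sketch, assuming Pass Rate is simply the passed/total ratio expressed as a percentage (the function name `pass_rate` is hypothetical and not part of BLXBench):

```python
def pass_rate(passed_tests: int, total_tests: int) -> float:
    """Return passed/total as a percentage.

    Assumption: Pass Rate is a plain ratio of the two counts; the
    leaderboard page itself does not state the formula.
    """
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= passed_tests <= total_tests:
        raise ValueError("passed_tests must be between 0 and total_tests")
    return 100.0 * passed_tests / total_tests
```

Under that assumption, a run that passes 3 of 4 tests would have a Pass Rate of 75%.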

Latest Results

Rows ranked by the source leaderboard order.

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | Grok 4.3 | 85.50 | Grok 4.3 (x-ai-grok-4.3) | Imported | 2026-05-06
2 | Claude Opus 4.7 | 84.80 | Claude Opus 4.7 (anthropic-claude-opus-4.7) | Imported | 2026-05-06
3 | Gpt Chat Latest | 83.80 | | Imported | 2026-05-06
4 | Owl Alpha | 83.60 | Owl Alpha (openrouter-owl-alpha) | Imported | 2026-05-06
5 | Qwen3.6 Flash | 82.80 | Qwen3.6 Flash (qwen-qwen3.6-flash) | Imported | 2026-05-06
6 | Grok 4.20 | 79.10 | Grok 4.20 (x-ai-grok-4.20) | Imported | 2026-05-06
7 | Gpt Mini Latest | 78.30 | OpenAI GPT Mini Latest (openai-gpt-mini-latest) | Imported | 2026-05-06
8 | Ling 2.6 1t | 75.30 | Ling-2.6-1T (inclusionai-ling-2.6-1t) | Imported | 2026-05-06
9 | Mistral Small 2603 | 75.20 | Mistral: Mistral Small 4 (mistralai-mistral-small-2603) | Imported | 2026-05-06
10 | Claude Opus 4.6 | 71.10 | Claude Opus 4.6 (anthropic-claude-opus-4.6) | Imported | 2026-05-06
11 | Gpt 5.5 | 65.90 | GPT-5.5 (openai-gpt-5.5) | Imported | 2026-05-06
12 | Granite 4.1 8b | 64.60 | Granite 4.1 8B (ibm-granite-granite-4.1-8b) | Imported | 2026-05-06
13 | Deepseek V4 Flash | 48.30 | DeepSeek V4 Flash (deepseek-deepseek-v4-flash) | Imported | 2026-05-06
14 | Gpt 5.5 Pro | 44.50 | GPT-5.5 Pro (openai-gpt-5.5-pro) | Imported | 2026-05-06
15 | Mimo V2.5 Pro | 44 | MiMo-V2.5-Pro (xiaomi-mimo-v2.5-pro) | Imported | 2026-05-06
16 | Qwen3.6 35b A3b | 33.90 | Qwen3.6 35B A3B (qwen-qwen3.6-35b-a3b) | Imported | 2026-05-06
17 | Nemotron 3 Nano Omni 30b A3b Reasoning | 29.90 | Nemotron 3 Nano Omni (nvidia-nemotron-3-nano-omni-30b-a3b-reasoning) | Imported | 2026-05-06
18 | Qwen3.6 27b | 29 | Qwen3.6 27B (qwen-qwen3.6-27b) | Imported | 2026-05-06
19 | Mimo V2.5 | 15.70 | MiMo-V2.5 (xiaomi-mimo-v2.5) | Imported | 2026-05-06
20 | Kimi K2.6 | 15.40 | MoonshotAI: Kimi K2.6 (moonshotai-kimi-k2.6) | Imported | 2026-05-06
21 | Deepseek V4 Pro | 15.20 | DeepSeek V4 Pro (deepseek-deepseek-v4-pro) | Imported | 2026-05-06
22 | Glm 5.1 | 13.90 | GLM 5.1 (z-ai-glm-5.1) | Imported | 2026-05-06
23 | Minimax M2.7 | 4.60 | MiniMax M2.7 (minimax-minimax-m2.7) | Imported | 2026-05-06
24 | Gemini 3.1 Pro Preview | 3.70 | Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview) | Imported | 2026-05-06
25 | Gemini Pro Latest | 2.90 | Google Gemini Pro Latest (google-gemini-pro-latest) | Imported | 2026-05-06