DAXBench

Benchmark for evaluating how well language models write production-grade DAX for Power BI and Analysis Services business-intelligence scenarios.

Rows: 50
Primary metric: score
Sampled: 2026-05-28

Metadata

Metrics

Score, Accuracy, Syntax, Tasks Solved

Latest Results

Rows are imported from the official DAXBench homepage table. The page states that the benchmark has 30 tasks and 123 tested models; the visible table exposes the top 50 rows.
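As a quick check on what "top 50" means here, the arithmetic can be sketched as follows. The two constants are the figures quoted above; the variable names are illustrative only.

```python
# Figures quoted from the DAXBench homepage, as stated above.
TESTED_MODELS = 123   # models the page says were tested
VISIBLE_ROWS = 50     # rows exposed by the visible table

hidden = TESTED_MODELS - VISIBLE_ROWS          # models with no row in this import
coverage = VISIBLE_ROWS / TESTED_MODELS        # fraction of tested models shown

print(f"{VISIBLE_ROWS} of {TESTED_MODELS} models visible "
      f"({coverage:.1%}); {hidden} not shown")
```

So the rows below cover roughly two fifths of the tested models, and any model missing from the list may simply fall outside the visible table.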

Rank Subject Score Model Match Provenance Sampled
1 Gemini 3.1 Flash Lite Preview HIGH Google 97.4% Imported 2026-05-28
2 GPT-5.3 Chat OpenAI 96.2% GPT-5.3 Chat (openai-gpt-5.3-chat) Imported 2026-05-28
3 GLM 5 Z.AI 96.2% Imported 2026-05-28
4 GPT-5.4 Mini OpenAI 96.2% GPT-5.4 Mini (openai-gpt-5.4-mini) Imported 2026-05-28
5 Gemma 4 31B Google 94.5% Imported 2026-05-28
6 Gemini 3.1 Pro Preview HIGH Google 93.8% Imported 2026-05-28
7 Qwen3.6 Plus Preview (free) Qwen 93.3% Imported 2026-05-28
8 Qwen3.5-Flash MED Qwen 93.2% Imported 2026-05-28
9 GLM 5V Turbo Z.AI 91.5% Imported 2026-05-28
10 Qwen3.6 Max Preview Qwen 90.9% Imported 2026-05-28
11 GLM 5.1 Z.AI 90.3% Imported 2026-05-28
12 Qwen3.5 397B A17B Qwen 90.3% Imported 2026-05-28
13 Qwen3.5 Plus 2026-02-15 MED Qwen 89.7% Imported 2026-05-28
14 Qwen3.6 Plus (free) Qwen 89% Imported 2026-05-28
15 GPT-5.3-Codex HIGH OpenAI 88.6% GPT-5.3-Codex (openai-gpt-5.3-codex) Imported 2026-05-28
16 GPT-5.5 OpenAI 86.7% GPT-5.5 (openai-gpt-5.5) Imported 2026-05-28
17 gpt-oss-120b OpenAI 85.6% Imported 2026-05-28
18 GPT-5.1-Codex-Max OpenAI 85% GPT-5.1-Codex-Max (openai-gpt-5.1-codex-max) Imported 2026-05-28
19 KAT-Coder-Pro V2 Kwaipilot 84.8% Imported 2026-05-28
20 Claude Sonnet 4.6 MED Anthropic 84.5% Imported 2026-05-28
21 GLM 5 Turbo Z.AI 84.3% Imported 2026-05-28
22 Gemini 3 Flash Preview Google 83.8% Imported 2026-05-28
23 Gemini 2.5 Flash Preview 09-2025 Google 83.8% Imported 2026-05-28
24 DeepSeek V4 Pro DeepSeek 83.2% Imported 2026-05-28
25 GPT-5.4 HIGH OpenAI 83.2% GPT-5.4 (openai-gpt-5.4) Imported 2026-05-28
26 Qwen3.6 Flash Qwen 83.1% Imported 2026-05-28
27 Claude Opus 4.5 Anthropic 82.7% Imported 2026-05-28
28 Claude Opus 4.6 Anthropic 82% Imported 2026-05-28
29 Gemini 3.1 Flash Lite Google 82% Imported 2026-05-28
30 R1 DeepSeek 81.3% Imported 2026-05-28
31 Grok 4.3 xAI 81.3% Imported 2026-05-28
32 Claude Sonnet 4 Anthropic 81.3% Imported 2026-05-28
33 Gemini 3 Pro Preview Google 81.3% Imported 2026-05-28
34 o3 OpenAI 80.2% Imported 2026-05-28
35 Grok 4.20 Beta HIGH xAI 80.1% Imported 2026-05-28
36 GPT-5.2 Chat OpenAI 80.1% GPT-5.2 Chat (openai-gpt-5.2-chat) Imported 2026-05-28
37 DeepSeek V3.2 DeepSeek 79.9% Imported 2026-05-28
38 Qwen3.6 35B A3B Qwen 79.2% Imported 2026-05-28
39 GPT-5.2 OpenAI 78.4% GPT-5.2 (openai-gpt-5.2) Imported 2026-05-28
40 Kimi K2 Thinking Moonshot AI 78.4% Imported 2026-05-28
41 Aurora Alpha Openrouter 78.2% Imported 2026-05-28
42 Claude Sonnet 4.5 Anthropic 77.9% Imported 2026-05-28
43 Grok 4.20 Multi-Agent Beta HIGH xAI 77.9% Imported 2026-05-28
44 Hunter Alpha Openrouter 77.8% Imported 2026-05-28
45 Grok 4 xAI 76.7% Imported 2026-05-28
46 Gemini 2.0 Flash Google 76.6% Imported 2026-05-28
47 Gemini 2.0 Flash Experimental (free) Google 76.6% Imported 2026-05-28
48 o4 Mini OpenAI 76.5% Imported 2026-05-28
49 Gemini 2.5 Flash Google 76% Imported 2026-05-28
50 DeepSeek V3.1 DeepSeek 75.9% Imported 2026-05-28
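
For readers who want to work with these rows programmatically, the following is a minimal parsing sketch, not an official DAXBench tool. It assumes each row sits on a single line in the order rank, subject, score, optional model-match cell, "Imported", sampled date. The names Row, ROW_RE, and parse_row are hypothetical, and the subject field is kept whole (model name, any effort tag, and vendor) because the flattened text does not mark where one ends and the next begins.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    rank: int
    subject: str                 # model name, optional effort tag, vendor
    score: float                 # percentage, e.g. 97.4
    model_match: Optional[str]   # None when the row has no model-match cell
    sampled: str                 # ISO date string


# Assumed one-line layout: "<rank> <subject> <score>% [<match>] Imported <date>"
ROW_RE = re.compile(
    r"^(?P<rank>\d+)\s+"
    r"(?P<subject>.+?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)%\s+"
    r"(?:(?P<match>.+?)\s+)?"
    r"Imported\s+(?P<sampled>\d{4}-\d{2}-\d{2})$"
)


def parse_row(line: str) -> Optional[Row]:
    """Parse one leaderboard line; return None if it does not fit the layout."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return Row(
        rank=int(m["rank"]),
        subject=m["subject"],
        score=float(m["score"]),
        model_match=m["match"],
        sampled=m["sampled"],
    )


sample = [
    "1 Gemini 3.1 Flash Lite Preview HIGH Google 97.4% Imported 2026-05-28",
    "14 Qwen3.6 Plus (free) Qwen 89% Imported 2026-05-28",
    "16 GPT-5.5 OpenAI 86.7% GPT-5.5 (openai-gpt-5.5) Imported 2026-05-28",
]
rows = [parse_row(line) for line in sample]
for row in rows:
    print(row)
```

The score is anchored on the first number followed by a percent sign, so version numbers inside model names (for example "3.1" or "4.20") are not mistaken for scores. Lines that do not match return None rather than raising, which makes it easy to spot rows that are split across several lines.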