Structured Output Benchmark

The Structured Output Benchmark (SOB) evaluates how accurately language models produce schema-compliant, value-correct JSON from normalized text contexts spanning text QA, OCR-derived documents, and meeting transcripts.

Rows: 28
Primary metric: Overall
Sampled: 2026-05-06

Metadata

Metrics

Overall, Value Accuracy, Faithfulness, JSON Pass, Path Recall, Structure Coverage, Type Safety, Perfect Response

Latest Results

Rows are parsed from the public Interfaze SOB leaderboard HTML table. Values are percentages as displayed by the source.
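As a rough illustration of that import step, the sketch below pulls rows out of an HTML table using only Python's standard library. The markup in `SAMPLE_HTML` is hypothetical (the real page layout is not shown here), and the names `TableRowParser` and `parse_leaderboard` are invented for this example. Cell values are kept as strings so they stay exactly as displayed by the source.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the <tr> currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so multi-line cells become one string.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_leaderboard(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, cells)) for cells in body]


# Hypothetical markup: two rows from the listing, wrapped in an assumed layout.
SAMPLE_HTML = """
<table>
  <tr><th>Rank</th><th>Subject</th><th>Overall</th></tr>
  <tr><td>1</td><td>GPT-5.4</td><td>87</td></tr>
  <tr><td>2</td><td>Gemini-3.1-Pro</td><td>86.90</td></tr>
</table>
"""

rows = parse_leaderboard(SAMPLE_HTML)
for row in rows:
    print(row["Rank"], row["Subject"], row["Overall"])
```

Leaving `Overall` as text avoids silently turning `87` into `87.0` or `86.90` into `86.9`; any numeric conversion can happen later, at the point of comparison.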

Rank | Subject | Overall | Model Match | Provenance | Sampled
1 | GPT-5.4 | 87 | GPT-5.4 (openai-gpt-5.4) | Imported | 2026-05-06
2 | Gemini-3.1-Pro | 86.90 | Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview) | Imported | 2026-05-06
3 | GLM-5.1 | 86.60 | GLM GLM 5.1 (z-ai-glm-5.1) | Imported | 2026-05-06
4 | Claude-Opus-4.7 | 86.40 | Claude Opus 4.7 (anthropic-claude-opus-4.7) | Imported | 2026-05-06
5 | GLM-4.7 | 86.10 | GLM GLM 4.7 (z-ai-glm-4.7) | Imported | 2026-05-06
6 | Qwen3.5-35B | 86.10 | - | Imported | 2026-05-06
7 | GPT-5.5 | 86 | GPT-5.5 (openai-gpt-5.5) | Imported | 2026-05-06
8 | Gemini-2.5-Flash | 86 | Gemini 2.5 Flash (google-gemini-2.5-flash) | Imported | 2026-05-06
9 | Qwen3-235B | 85.70 | Qwen3 235B A22B (qwen-qwen3-235b-a22b) | Imported | 2026-05-06
10 | Interfaze-Beta | 85.50 | - | Imported | 2026-05-06
11 | Claude-Sonnet-4.6 | 85.40 | Claude Sonnet 4.6 (anthropic-claude-sonnet-4.6) | Imported | 2026-05-06
12 | Claude-Opus-4.6 | 85.30 | Claude Opus 4.6 (anthropic-claude-opus-4.6) | Imported | 2026-05-06
13 | DeepSeek-V4-Pro | 85.30 | DeepSeek V4 Pro (deepseek-deepseek-v4-pro) | Imported | 2026-05-06
14 | Kimi-2.6 | 85.30 | - | Imported | 2026-05-06
15 | GPT-4.1 | 85 | GPT-4.1 (openai-gpt-4.1) | Imported | 2026-05-06
16 | GPT-5 | 84.90 | GPT-5 (openai-gpt-5) | Imported | 2026-05-06
17 | Gemma-3-27B | 84.70 | Gemma 3 27B (google-gemma-3-27b-it) | Imported | 2026-05-06
18 | Qwen3-30B | 84.20 | - | Imported | 2026-05-06
19 | Nemotron-3-Nano-30B | 84.10 | - | Imported | 2026-05-06
20 | GPT-5-Mini | 83.50 | GPT-5 Mini (openai-gpt-5-mini) | Imported | 2026-05-06
21 | Gemma-4-31B | 83.30 | Gemma 4 31B (google-gemma-4-31b-it) | Imported | 2026-05-06
22 | Gemini-3-Flash-Preview | 83.30 | Gemini 3 Flash Preview (google-gemini-3-flash-preview) | Imported | 2026-05-06
23 | Schematron-8B | 83.20 | - | Imported | 2026-05-06
24 | IBM-Granite-4.0 | 83.20 | - | Imported | 2026-05-06
25 | Phi-4 | 83.10 | Phi 4 (microsoft-phi-4) | Imported | 2026-05-06
26 | DS-R1-Distill-32B | 82.70 | - | Imported | 2026-05-06
27 | Ministral-3-14B | 77.80 | - | Imported | 2026-05-06
28 | GPT-OSS-20B | 73.20 | gpt-oss-20b (openai-gpt-oss-20b) | Imported | 2026-05-06
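
Several subjects in the listing share the same displayed Overall value while holding different ranks. The sketch below groups the 28 rows by score to surface those ties; the helper names `tied_groups` and `is_sorted_desc` are made up for this example. How the source orders tied rows (hidden precision, alphabetical order, or something else) is not stated, so the code only reports the ties and does not try to explain them.

```python
from collections import defaultdict

# (rank, subject, Overall as displayed) copied from the listing above.
ROWS = [
    (1, "GPT-5.4", "87"), (2, "Gemini-3.1-Pro", "86.90"),
    (3, "GLM-5.1", "86.60"), (4, "Claude-Opus-4.7", "86.40"),
    (5, "GLM-4.7", "86.10"), (6, "Qwen3.5-35B", "86.10"),
    (7, "GPT-5.5", "86"), (8, "Gemini-2.5-Flash", "86"),
    (9, "Qwen3-235B", "85.70"), (10, "Interfaze-Beta", "85.50"),
    (11, "Claude-Sonnet-4.6", "85.40"), (12, "Claude-Opus-4.6", "85.30"),
    (13, "DeepSeek-V4-Pro", "85.30"), (14, "Kimi-2.6", "85.30"),
    (15, "GPT-4.1", "85"), (16, "GPT-5", "84.90"),
    (17, "Gemma-3-27B", "84.70"), (18, "Qwen3-30B", "84.20"),
    (19, "Nemotron-3-Nano-30B", "84.10"), (20, "GPT-5-Mini", "83.50"),
    (21, "Gemma-4-31B", "83.30"), (22, "Gemini-3-Flash-Preview", "83.30"),
    (23, "Schematron-8B", "83.20"), (24, "IBM-Granite-4.0", "83.20"),
    (25, "Phi-4", "83.10"), (26, "DS-R1-Distill-32B", "82.70"),
    (27, "Ministral-3-14B", "77.80"), (28, "GPT-OSS-20B", "73.20"),
]


def tied_groups(rows):
    """Map each Overall score shared by two or more subjects to those subjects."""
    by_score = defaultdict(list)
    for _rank, subject, overall in rows:
        # "86" and "86.00" would parse to the same float, so display
        # differences do not hide a tie.
        by_score[float(overall)].append(subject)
    return {score: names for score, names in by_score.items() if len(names) > 1}


def is_sorted_desc(rows):
    """True when Overall never increases as rank goes from 1 to len(rows)."""
    scores = [float(overall) for _rank, _subject, overall in rows]
    return all(a >= b for a, b in zip(scores, scores[1:]))


ties = tied_groups(ROWS)
for score, names in ties.items():
    print(f"{score:.2f}: {', '.join(names)}")
print("sorted by Overall:", is_sorted_desc(ROWS))
```

Given ties like these, small rank differences among rows with equal displayed scores are best read as ordering within a group.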