JSONSchemaBench
Structured-output benchmark measuring schema compliance and JSON validity for language models across easy, medium, and hard schema-constrained generation datasets.
45 rows
Primary metric: schema_compliance_pct
Sampled: 2026-05-28
Metadata
Metrics
Schema Compliance, Schema Compliance CI (lower is better), JSON Validity, JSON Validity CI (lower is better)
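The two headline metrics measure different things: an output can be valid JSON and still violate its schema. The sketch below illustrates that distinction and is not the benchmark's actual scoring harness. The `conforms` and `score` helpers are hypothetical names, and the checker understands only a small subset of JSON Schema (`type`, `required`, `properties`), chosen so the example needs nothing beyond the standard library.

```python
import json

# Hypothetical, deliberately tiny schema checker (an assumption for
# illustration): it handles only "type" (as a single string), "required",
# and "properties". Real JSON Schema defines many more keywords.
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def conforms(value, schema):
    """Return True if value satisfies the supported subset of schema."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            return False
        if not isinstance(value, _TYPES[expected]):
            return False
    if isinstance(value, dict):
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Each declared property that is present must satisfy its sub-schema.
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value and not conforms(value[key], sub_schema):
                return False
    return True


def score(output, schema):
    """Return (json_valid, schema_compliant) for one model output string."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        # Unparseable output fails both metrics.
        return (False, False)
    return (True, conforms(parsed, schema))


# Example schema, invented for this sketch.
schema = {
    "type": "object",
    "required": ["name", "age"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
}
```

Under these assumptions, `score('{"name": "Ada"}', schema)` counts toward JSON validity but not toward schema compliance, because the required `age` key is missing.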
| Rank | Subject | Schema Compliance | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | openAI/gpt-4o | 96.9% schema compliance | GPT-4o openai-gpt-4o | Imported | 2026-05-28 |
| 2 | openAI/gpt-4o-mini | 95.8% schema compliance | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-28 |
| 3 | Qwen/Qwen2.5-72B-Instruct | 95.5% schema compliance | Qwen2.5 72B Instruct qwen-qwen-2.5-72b-instruct | Imported | 2026-05-28 |
| 4 | Qwen/Qwen2.5-Coder-7B-Instruct | 94.3% schema compliance | — | Imported | 2026-05-28 |
| 5 | Qwen/Qwen2.5-32B-Instruct | 94.3% schema compliance | — | Imported | 2026-05-28 |
| 6 | Qwen/Qwen2.5-14B-Instruct | 92.1% schema compliance | — | Imported | 2026-05-28 |
| 7 | Qwen/Qwen2.5-Coder-3B-Instruct | 91.2% schema compliance | — | Imported | 2026-05-28 |
| 8 | meta-llama/Llama-3.1-8B-Instruct | 91.1% schema compliance | Llama 3.1 8B Instruct meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-28 |
| 9 | openAI/gpt-4o | 89.6% schema compliance | GPT-4o openai-gpt-4o | Imported | 2026-05-28 |
| 10 | meta-llama/Llama-3.2-3B-Instruct | 88.3% schema compliance | Llama 3.2 3B Instruct meta-llama-llama-3.2-3b-instruct | Imported | 2026-05-28 |
| 11 | openAI/gpt-4o | 87.8% schema compliance | GPT-4o openai-gpt-4o | Imported | 2026-05-28 |
| 12 | Qwen/Qwen2.5-32B-Instruct | 86.9% schema compliance | — | Imported | 2026-05-28 |
| 13 | openAI/gpt-4o-mini | 86.2% schema compliance | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-28 |
| 14 | Qwen/Qwen2.5-72B-Instruct | 84% schema compliance | Qwen2.5 72B Instruct qwen-qwen-2.5-72b-instruct | Imported | 2026-05-28 |
| 15 | Qwen/Qwen2.5-3B-Instruct | 83.8% schema compliance | — | Imported | 2026-05-28 |
| 16 | microsoft/Phi-4-mini-instruct | 81.3% schema compliance | — | Imported | 2026-05-28 |
| 17 | Qwen/Qwen2.5-14B-Instruct | 80.1% schema compliance | — | Imported | 2026-05-28 |
| 18 | Qwen/Qwen2.5-Coder-7B-Instruct | 79.8% schema compliance | — | Imported | 2026-05-28 |
| 19 | microsoft/Phi-3.5-mini-instruct | 77.2% schema compliance | — | Imported | 2026-05-28 |
| 20 | meta-llama/Llama-3.1-8B-Instruct | 76% schema compliance | Llama 3.1 8B Instruct meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-28 |
| 21 | Qwen/Qwen2.5-32B-Instruct | 74.7% schema compliance | — | Imported | 2026-05-28 |
| 22 | Qwen/Qwen2.5-Coder-3B-Instruct | 73.2% schema compliance | — | Imported | 2026-05-28 |
| 23 | google/gemma-3-1b-it | 69.9% schema compliance | — | Imported | 2026-05-28 |
| 24 | openAI/gpt-4o-mini | 68.5% schema compliance | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-28 |
| 25 | Qwen/Qwen2.5-3B-Instruct | 67.1% schema compliance | — | Imported | 2026-05-28 |
| 26 | Qwen/Qwen2.5-72B-Instruct | 66.7% schema compliance | Qwen2.5 72B Instruct qwen-qwen-2.5-72b-instruct | Imported | 2026-05-28 |
| 27 | meta-llama/Llama-3.2-1B-Instruct | 65.6% schema compliance | Llama 3.2 1B Instruct meta-llama-llama-3.2-1b-instruct | Imported | 2026-05-28 |
| 28 | meta-llama/Llama-3.2-3B-Instruct | 65.3% schema compliance | Llama 3.2 3B Instruct meta-llama-llama-3.2-3b-instruct | Imported | 2026-05-28 |
| 29 | microsoft/Phi-4-mini-instruct | 63.2% schema compliance | — | Imported | 2026-05-28 |
| 30 | microsoft/Phi-3.5-mini-instruct | 63.1% schema compliance | — | Imported | 2026-05-28 |
| 31 | Qwen/Qwen2.5-Coder-7B-Instruct | 60.5% schema compliance | — | Imported | 2026-05-28 |
| 32 | Qwen/Qwen2.5-7B-Instruct | 54.8% schema compliance | Qwen2.5 7B Instruct qwen-qwen-2.5-7b-instruct | Imported | 2026-05-28 |
| 33 | Qwen/Qwen2.5-14B-Instruct | 50.5% schema compliance | — | Imported | 2026-05-28 |
| 34 | Qwen/Qwen2.5-Coder-3B-Instruct | 42.5% schema compliance | — | Imported | 2026-05-28 |
| 35 | meta-llama/Llama-3.1-8B-Instruct | 42.2% schema compliance | Llama 3.1 8B Instruct meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-28 |
| 36 | meta-llama/Llama-3.2-1B-Instruct | 39% schema compliance | Llama 3.2 1B Instruct meta-llama-llama-3.2-1b-instruct | Imported | 2026-05-28 |
| 37 | Qwen/Qwen2.5-3B-Instruct | 38.7% schema compliance | — | Imported | 2026-05-28 |
| 38 | google/gemma-3-1b-it | 37.1% schema compliance | — | Imported | 2026-05-28 |
| 39 | Qwen/Qwen2.5-7B-Instruct | 33.2% schema compliance | Qwen2.5 7B Instruct qwen-qwen-2.5-7b-instruct | Imported | 2026-05-28 |
| 40 | meta-llama/Llama-3.2-3B-Instruct | 32% schema compliance | Llama 3.2 3B Instruct meta-llama-llama-3.2-3b-instruct | Imported | 2026-05-28 |
| 41 | microsoft/Phi-4-mini-instruct | 28.5% schema compliance | — | Imported | 2026-05-28 |
| 42 | microsoft/Phi-3.5-mini-instruct | 28% schema compliance | — | Imported | 2026-05-28 |
| 43 | google/gemma-3-1b-it | 22.3% schema compliance | — | Imported | 2026-05-28 |
| 44 | meta-llama/Llama-3.2-1B-Instruct | 14.5% schema compliance | Llama 3.2 1B Instruct meta-llama-llama-3.2-1b-instruct | Imported | 2026-05-28 |
| 45 | Qwen/Qwen2.5-7B-Instruct | 5.38% schema compliance | Qwen2.5 7B Instruct qwen-qwen-2.5-7b-instruct | Imported | 2026-05-28 |
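Most subjects appear in the table more than once. The page does not label which dataset each row belongs to; one plausible reading, which is an assumption here, is that each row corresponds to one of the easy, medium, or hard datasets named in the description. A minimal sketch of grouping repeated rows by subject, using a few values copied from the table (`group_by_subject` is a hypothetical helper):

```python
from collections import defaultdict

# A few (subject, schema compliance %) pairs copied from the table above.
rows = [
    ("openAI/gpt-4o", 96.9),
    ("openAI/gpt-4o-mini", 95.8),
    ("openAI/gpt-4o", 89.6),
    ("openAI/gpt-4o", 87.8),
    ("openAI/gpt-4o-mini", 86.2),
    ("openAI/gpt-4o-mini", 68.5),
]


def group_by_subject(rows):
    """Collect each subject's scores, sorted from best to worst."""
    grouped = defaultdict(list)
    for subject, pct in rows:
        grouped[subject].append(pct)
    return {subject: sorted(scores, reverse=True)
            for subject, scores in grouped.items()}


by_subject = group_by_subject(rows)

# Gap between a subject's best and worst row, in percentage points.
spread = {subject: round(scores[0] - scores[-1], 1)
          for subject, scores in by_subject.items()}
```

Viewing the rows this way shows how far a single model's compliance can vary across its entries, which the flat ranking obscures.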