DomainCodeBench

Domain-specific code generation benchmark across healthcare systems, financial algorithms, molecular simulation, and legal document processing, scored for functional correctness, compliance, domain API coverage, code quality, and reference similarity.

4 rows
Primary metric: composite_score
Sampled: 2026-05-06

Metadata

Metrics

Composite Score, Pass Rate, Domain API Coverage, Code Quality, Compliance, Healthcare Composite, Finance Composite, Molecular Sim Composite, Legal Composite, Easy Composite, Medium Composite, Hard Composite

Latest Results

Rows are parsed from the public DomainCodeBench leaderboard JSON. Source fractional scores are converted to percentages.

Rank  Subject           Composite Score (%)  Model Match  Provenance  Sampled
1     Qwen2.5-Coder-7B  89.77                             Imported    2026-05-06
2     StarCoder2-15B    88.96                             Imported    2026-05-06
3     Qwen2.5-Coder-3B  87.46                             Imported    2026-05-06
4     CodeLlama-7B      73.84                             Imported    2026-05-06
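
The fraction-to-percentage conversion and ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual import code: the JSON layout, the field names `subject` and `composite_score` as object keys, and the helper names `to_percentage` and `build_rows` are all assumptions, and the fractional inputs are simply the table's percentages divided by 100.

```python
import json

# Hypothetical payload: the real leaderboard JSON schema is not shown in the
# source, so this shape and these keys are assumptions for illustration only.
SAMPLE_JSON = """
[
  {"subject": "CodeLlama-7B",     "composite_score": 0.7384},
  {"subject": "Qwen2.5-Coder-7B", "composite_score": 0.8977},
  {"subject": "StarCoder2-15B",   "composite_score": 0.8896},
  {"subject": "Qwen2.5-Coder-3B", "composite_score": 0.8746}
]
"""


def to_percentage(fraction, digits=2):
    """Convert a fractional score in [0, 1] to a percentage rounded to `digits`."""
    return round(fraction * 100, digits)


def build_rows(raw_json, metric="composite_score"):
    """Parse entries, sort by the primary metric (descending), and assign ranks."""
    entries = json.loads(raw_json)
    ranked = sorted(entries, key=lambda entry: entry[metric], reverse=True)
    return [
        {
            "rank": position,
            "subject": entry["subject"],
            metric: to_percentage(entry[metric]),
        }
        for position, entry in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    for row in build_rows(SAMPLE_JSON):
        print(row["rank"], row["subject"], row["composite_score"])
```

Sorting on the raw fraction before rounding keeps the ranking independent of the display precision, so two entries that round to the same percentage would still be ordered by their underlying scores.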