HoudiniVexBench

VEX code generation and understanding benchmark for SideFX Houdini shader/programming tasks, covering code completion, doc-to-code, and code explanation.

Rows: 3
Primary metric: overall_score
Sampled: 2026-05-06

Metadata

Metrics

Overall Score, Overall Latency (lower is better), Code Completion Score, Code Completion Latency (lower is better), Doc To Code Score, Doc To Code Latency (lower is better), Code Explanation Score, Code Explanation Latency (lower is better)

Latest Results

Rows are parsed from the public benchmark_results.json file. Source model IDs and display names are preserved.

Rank  Subject                    Overall Score  Model Match                                  Provenance  Sampled
1     Claude Opus 4.5 (Bedrock)  0.51           Claude Opus 4.5 (anthropic-claude-opus-4.5)  Imported    2026-05-06
2     Gemini 3 Pro Preview       0.50           Gemini 3 (google-gemini-3)                   Imported    2026-05-06
3     GPT-5.2                    0.49           GPT-5.2 (openai-gpt-5.2)                     Imported    2026-05-06
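
As a rough illustration of how rows like these could be derived from benchmark_results.json, here is a minimal Python sketch. The file's actual schema is not shown on this page, so the field names `display_name` and `overall_score`, the flat list layout, and the helper `rank_rows` are all assumptions made for the example. The sample values are copied from the table above.

```python
import json

# Hypothetical payload: the real benchmark_results.json layout is not shown
# on this page, so the list-of-objects shape and both field names are assumed.
SAMPLE_JSON = """
[
  {"display_name": "GPT-5.2", "overall_score": 0.49},
  {"display_name": "Claude Opus 4.5 (Bedrock)", "overall_score": 0.51},
  {"display_name": "Gemini 3 Pro Preview", "overall_score": 0.50}
]
"""


def rank_rows(text):
    """Parse JSON text and return rows sorted by overall_score, highest first.

    Every source field is copied through unchanged; only a 1-based
    "rank" key is added to each row.
    """
    rows = json.loads(text)
    ranked = sorted(rows, key=lambda r: r["overall_score"], reverse=True)
    return [{**row, "rank": i} for i, row in enumerate(ranked, start=1)]


rows = rank_rows(SAMPLE_JSON)
for row in rows:
    print(row["rank"], row["display_name"], row["overall_score"])
```

Copying each source row through untouched and only adding a rank mirrors the note that source display names are preserved. The latency metrics are marked "lower is better", so a sort on them would use ascending order.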