OASB Skills Security Benchmark

Open Agent Security Benchmark results for AI-agent skills security scanners, measuring precision, recall, F1, false-positive rate, flag rate, and scan latency.

Rows: 3
Primary metric: F1
Sampled: 2026-05-27

Metadata

Metrics

Precision, Recall, F1, Flag Rate, False Positive Rate (lower is better), Average Scan Time (lower is better)
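
As a rough illustration of how the rate metrics listed above relate to each other, the sketch below derives them from confusion-matrix counts using the standard textbook definitions. The names ConfusionCounts and scanner_metrics, the example counts, and the reading of flag rate as the share of all scanned skills that were flagged are assumptions for illustration; OASB's own definitions and implementation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for one scanner's outcomes on a labeled set."""
    tp: int  # malicious skills that were flagged
    fp: int  # benign skills that were flagged
    tn: int  # benign skills that were not flagged
    fn: int  # malicious skills that were not flagged


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def scanner_metrics(c: ConfusionCounts) -> dict[str, float]:
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * recall, precision + recall)
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumption: flag rate = flagged items / all scanned items.
        "flag_rate": _ratio(c.tp + c.fp, total),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
    }


# Made-up counts, not taken from the benchmark.
metrics = scanner_metrics(ConfusionCounts(tp=80, fp=20, tn=880, fn=20))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it only rises when precision and recall rise together, which makes it a reasonable single ranking number for scanners that trade missed detections against false alarms.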

Latest Results

Rows parsed from OASB's public benchmark-results-v5.json. The benchmark reports skills-security scanner precision, recall, F1, false-positive rate, and scan-time metrics.

Rank  Subject                                     F1     Model Match  Provenance  Sampled
1     NanoMind TME v0.5.0 (model only)            0.892  -            Imported    2026-05-27
2     HMA Full Pipeline (AST + NanoMind v0.5.0)   0.813  -            Imported    2026-05-27
3     HMA Static Patterns (no NanoMind)           0.675  -            Imported    2026-05-27