HarmActionsEval

AI agent safety benchmark measuring how often autonomous LLM agents avoid harmful tool actions under adversarial pressure.

Rows: 10
Primary metric: safe_actions_at_1
Sampled: 2026-05-06

Metrics

SafeActions@1

Latest Results

Rows are ordered by their rank on the source leaderboard.

Rank  Subject                SafeActions@1  Model Match                                      Provenance  Sampled
1     Qwen3.5-397b-a17b      23.40          Qwen3.5 397B A17B (qwen-qwen3.5-397b-a17b)       Imported    2026-05-06
2     GPT-5.3                12.77          -                                                Imported    2026-05-06
3     Claude Sonnet 4.6      2.84           Claude Sonnet 4.6 (anthropic-claude-sonnet-4.6)  Imported    2026-05-06
4     Phi 4 Mini Reasoning   2.84           -                                                Imported    2026-05-06
5     Ministral 3 (3B)       2.13           -                                                Imported    2026-05-06
6     GPT-5.4 Mini           0.71           GPT-5.4 Mini (openai-gpt-5.4-mini)               Imported    2026-05-06
7     Gemini 3.1 Flash Lite  0.71           -                                                Imported    2026-05-06
8     Claude Haiku 4.5       0              Claude Haiku 4.5 (anthropic-claude-haiku-4.5)    Imported    2026-05-06
9     Phi 4 Mini Instruct    0              -                                                Imported    2026-05-06
10    Granite 4-H-Tiny       0              -                                                Imported    2026-05-06
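
Several rows in the results share the same SafeActions@1 value but have different ranks, so the source rank carries ordering information that the score alone does not. The following is a minimal sketch of holding the rows as records and checking that relationship. The `Row` class and both helper functions are hypothetical names introduced here for illustration. The only input is the rank, subject, and score copied from the results, and nothing is assumed about how the source leaderboard breaks ties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row (hypothetical record type for illustration)."""
    rank: int
    subject: str
    safe_actions_at_1: float


# Values copied from the Latest Results rows.
ROWS = [
    Row(1, "Qwen3.5-397b-a17b", 23.40),
    Row(2, "GPT-5.3", 12.77),
    Row(3, "Claude Sonnet 4.6", 2.84),
    Row(4, "Phi 4 Mini Reasoning", 2.84),
    Row(5, "Ministral 3 (3B)", 2.13),
    Row(6, "GPT-5.4 Mini", 0.71),
    Row(7, "Gemini 3.1 Flash Lite", 0.71),
    Row(8, "Claude Haiku 4.5", 0.0),
    Row(9, "Phi 4 Mini Instruct", 0.0),
    Row(10, "Granite 4-H-Tiny", 0.0),
]


def rank_agrees_with_score(rows):
    """True if, in source-rank order, the score never increases."""
    ordered = sorted(rows, key=lambda r: r.rank)
    scores = [r.safe_actions_at_1 for r in ordered]
    return all(a >= b for a, b in zip(scores, scores[1:]))


def tied_groups(rows):
    """Map each score shared by two or more subjects to those subjects,
    listed in source-rank order."""
    groups = {}
    for r in sorted(rows, key=lambda r: r.rank):
        groups.setdefault(r.safe_actions_at_1, []).append(r.subject)
    return {score: names for score, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    print("rank agrees with score:", rank_agrees_with_score(ROWS))
    for score, names in tied_groups(ROWS).items():
        print(score, names)
```

Sorting by `rank` rather than by score keeps the source ordering inside tied groups, which a score-only sort could not reproduce reliably.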