AgentHarm
AgentHarm: a safety benchmark in the category covering model robustness, truthfulness, calibration, bias, harmfulness, jailbreak resistance, and alignment-relevant behavior.
Rows: 39
Primary metric: harm_score
Sampled: 2026-05-27
Metadata
Metrics: Harm score (lower is better), Refusal rate, Non-refusal harm score (lower is better), Benign non-refusal score
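The page lists these metrics without defining them. As a rough illustration only, the sketch below assumes each evaluated task yields a graded score in [0, 1] and a refusal flag, and that the harm score is the mean score over all tasks, the refusal rate is the fraction of refused tasks, and the non-refusal harm score is the mean score over tasks that were not refused. The `TaskResult` type and `summarize` function are hypothetical names, not part of any AgentHarm tooling, and the benign non-refusal score (which would need a separate set of benign tasks) is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    """Hypothetical per-task record; field names are assumptions."""
    score: float   # graded score for one task, assumed to lie in [0, 1]
    refused: bool  # whether the model refused the task


def summarize(results: list[TaskResult]) -> dict[str, Optional[float]]:
    """Aggregate per-task records into the three assumed metrics."""
    if not results:
        raise ValueError("need at least one task result")
    n = len(results)
    answered = [r for r in results if not r.refused]
    return {
        # Mean score over every task, refused or not.
        "harm_score": sum(r.score for r in results) / n,
        # Fraction of tasks the model refused.
        "refusal_rate": sum(1 for r in results if r.refused) / n,
        # Mean score restricted to non-refused tasks; undefined if all were refused.
        "non_refusal_harm_score": (
            sum(r.score for r in answered) / len(answered) if answered else None
        ),
    }


# Made-up records for illustration; these are not benchmark data.
example = [
    TaskResult(score=0.0, refused=True),
    TaskResult(score=0.5, refused=False),
    TaskResult(score=1.0, refused=False),
    TaskResult(score=0.0, refused=True),
]
summary = summarize(example)
```

Under these assumptions a model can lower its harm score either by refusing more often or by completing fewer steps of the tasks it does attempt, which is why the non-refusal variant is reported separately.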
| Rank | Subject (attack setting) | Harm score (lower is better) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Llama-3.1 8B (None) | 3.1% | — | Imported | 2026-05-27 |
| 2 | Llama-3.1 405B (None) | 4.3% | — | Imported | 2026-05-27 |
| 3 | Llama-3.1 405B (Template) | 4.3% | — | Imported | 2026-05-27 |
| 4 | Claude 3 Haiku (Template) | 6.6% | Claude 3 Haiku anthropic-claude-3-haiku | Imported | 2026-05-27 |
| 5 | Gemini 1.0 Pro (None) | 7.4% | — | Imported | 2026-05-27 |
| 6 | Claude 3 Haiku (None) | 11.1% | Claude 3 Haiku anthropic-claude-3-haiku | Imported | 2026-05-27 |
| 7 | Claude 3.5 Sonnet (None) | 13.5% | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-27 |
| 8 | Llama-3.1 70B (None) | 14.0% | — | Imported | 2026-05-27 |
| 9 | Claude 3 Opus (None) | 14.4% | — | Imported | 2026-05-27 |
| 10 | Llama-3.1 70B (Template) | 15.0% | — | Imported | 2026-05-27 |
| 11 | Gemini 1.5 Pro (None) | 15.7% | — | Imported | 2026-05-27 |
| 12 | Claude 3 Sonnet (None) | 20.7% | — | Imported | 2026-05-27 |
| 13 | Gemini 1.5 Flash (None) | 20.7% | — | Imported | 2026-05-27 |
| 14 | Gemini 1.0 Pro (Template) | 23.3% | — | Imported | 2026-05-27 |
| 15 | Claude 3.5 Sonnet (Forced tool call) | 26.9% | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-27 |
| 16 | Llama-3.1 8B (Template) | 27.5% | — | Imported | 2026-05-27 |
| 17 | Claude 3 Opus (Forced tool call) | 29.5% | — | Imported | 2026-05-27 |
| 18 | Claude 3 Haiku (Forced tool call) | 33.9% | Claude 3 Haiku anthropic-claude-3-haiku | Imported | 2026-05-27 |
| 19 | Claude 3 Sonnet (Forced tool call) | 42.8% | — | Imported | 2026-05-27 |
| 20 | Claude 3 Opus (Template) | 45.7% | — | Imported | 2026-05-27 |
| 21 | GPT-4o (None) | 48.4% | GPT-4o openai-gpt-4o | Imported | 2026-05-27 |
| 22 | Claude 3 Sonnet (Template) | 52.8% | — | Imported | 2026-05-27 |
| 23 | Gemini 1.5 Pro (Template) | 56.1% | — | Imported | 2026-05-27 |
| 24 | Gemini 1.5 Flash (Template) | 56.6% | — | Imported | 2026-05-27 |
| 25 | GPT-4o (Forced tool call) | 57.7% | GPT-4o openai-gpt-4o | Imported | 2026-05-27 |
| 26 | GPT-3.5 Turbo (Template) | 62.0% | GPT-3.5 Turbo openai-gpt-3.5-turbo | Imported | 2026-05-27 |
| 27 | GPT-3.5 Turbo (None) | 62.2% | GPT-3.5 Turbo openai-gpt-3.5-turbo | Imported | 2026-05-27 |
| 28 | GPT-4o mini (None) | 62.5% | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-27 |
| 29 | GPT-3.5 Turbo (Forced tool call) | 63.2% | GPT-3.5 Turbo openai-gpt-3.5-turbo | Imported | 2026-05-27 |
| 30 | GPT-4o mini (Forced tool call) | 68.4% | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-27 |
| 31 | Claude 3.5 Sonnet (Template) | 68.7% | Claude 3.5 Sonnet anthropic-claude-3.5-sonnet | Imported | 2026-05-27 |
| 32 | GPT-4o mini (Template) | 68.8% | GPT-4o-mini openai-gpt-4o-mini | Imported | 2026-05-27 |
| 33 | Mistral Small 2 (None) | 72.0% | — | Imported | 2026-05-27 |
| 34 | GPT-4o (Template) | 72.7% | GPT-4o openai-gpt-4o | Imported | 2026-05-27 |
| 35 | Mistral Small 2 (Template) | 72.7% | — | Imported | 2026-05-27 |
| 36 | Mistral Small 2 (Forced tool call) | 73.7% | — | Imported | 2026-05-27 |
| 37 | Mistral Large 2 (Template) | 80.5% | — | Imported | 2026-05-27 |
| 38 | Mistral Large 2 (Forced tool call) | 80.9% | — | Imported | 2026-05-27 |
| 39 | Mistral Large 2 (None) | 82.2% | — | Imported | 2026-05-27 |
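Each Subject label combines a model name with a parenthetical suffix (None, Template, or Forced tool call), so the same model appears in several rows. A minimal sketch of splitting the labels and comparing settings per model follows. It assumes the suffix names the evaluation setting, uses a handful of rows copied from the table, and introduces `split_subject` and `by_model` as hypothetical helper names.

```python
import re
from collections import defaultdict

# A few (subject, harm score in %) pairs copied from the table above.
ROWS = [
    ("GPT-4o (None)", 48.4),
    ("GPT-4o (Forced tool call)", 57.7),
    ("GPT-4o (Template)", 72.7),
    ("Claude 3.5 Sonnet (None)", 13.5),
    ("Claude 3.5 Sonnet (Forced tool call)", 26.9),
    ("Claude 3.5 Sonnet (Template)", 68.7),
]

# Model name, a space, then the setting in a trailing pair of parentheses.
SUBJECT_RE = re.compile(r"^(?P<model>.+) \((?P<setting>[^()]+)\)$")


def split_subject(subject: str) -> tuple[str, str]:
    """Split 'Model (Setting)' into its two parts."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        raise ValueError(f"unexpected subject label: {subject!r}")
    return match["model"], match["setting"]


def by_model(rows: list[tuple[str, float]]) -> dict[str, dict[str, float]]:
    """Group harm scores as {model: {setting: score}}."""
    grouped: dict[str, dict[str, float]] = defaultdict(dict)
    for subject, score in rows:
        model, setting = split_subject(subject)
        grouped[model][setting] = score
    return dict(grouped)


grouped = by_model(ROWS)
for model, scores in grouped.items():
    # Difference in percentage points between the two settings.
    delta = scores["Template"] - scores["None"]
    print(f"{model}: Template minus None = {delta:+.1f} points")
```

Grouping this way makes it easier to see how much a single model's harm score moves between settings than scanning the rank-ordered table, where a model's rows are scattered.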