CyberSecEval
CyberSecEval: Measures model robustness, truthfulness, calibration, bias, harmfulness, jailbreak resistance, or alignment-relevant behavior.
Metadata
Metrics
Lower is better for every metric: Average Injection Success Rate, Different User Input Language, Output Formatting Manipulation, Overload With Information, Many Shot Attack, Ignore Previous Instructions, System Mode, Few Shot Attack, Indirect Reference, Repeated Token Attack, Persuasion, Mixed Techniques, Virtualization, Payload Splitting, Hypothetical Scenario, Token Smuggling
| Rank | Subject | Average Injection Success Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | codellama-70b-instruct | 12.93% | — | Imported | 2026-05-27 |
| 2 | gpt-4 | 19.87% | GPT-4 (openai-gpt-4) | Imported | 2026-05-27 |
| 3 | llama 3 70b-instruct | 29.27% | Llama 3 70B Instruct (meta-llama-llama-3-70b-instruct) | Imported | 2026-05-27 |
| 4 | codellama-34b-instruct | 36.33% | — | Imported | 2026-05-27 |
| 5 | codellama-13b-instruct | 37.27% | — | Imported | 2026-05-27 |
| 6 | gpt-3.5-turbo | 39.13% | GPT-3.5 Turbo (openai-gpt-3.5-turbo) | Imported | 2026-05-27 |
| 7 | llama 3 8b-instruct | 45.27% | Llama 3 8B Instruct (meta-llama-llama-3-8b-instruct) | Imported | 2026-05-27 |
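
Because the ranking metric is a success rate for attacks, the table is ordered ascending: the lowest Average Injection Success Rate takes rank 1. The following is a minimal sketch of that ordering, using the scores copied from the table. The helper name `rank_lower_is_better` is hypothetical and is not part of CyberSecEval or of the leaderboard's own tooling.

```python
# Average Injection Success Rate (percent), copied from the table above.
scores = {
    "codellama-70b-instruct": 12.93,
    "gpt-4": 19.87,
    "llama 3 70b-instruct": 29.27,
    "codellama-34b-instruct": 36.33,
    "codellama-13b-instruct": 37.27,
    "gpt-3.5-turbo": 39.13,
    "llama 3 8b-instruct": 45.27,
}


def rank_lower_is_better(scores):
    """Hypothetical helper: return (rank, subject, score) tuples, lowest score first."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return [(rank, name, value) for rank, (name, value) in enumerate(ordered, start=1)]


ranking = rank_lower_is_better(scores)
for rank, name, value in ranking:
    print(f"{rank}. {name}: {value:.2f}%")
```

Sorting ascending is the only design choice here; a "higher is better" metric would need `reverse=True` in the `sorted` call.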