CAIS Risk Index
Composite risk index from the CAIS AI Dashboard: the average of VCT refusal risk, HLE miscalibration, MASK risk, Machiavelli risk, and TextQuests Harm, reported only for models with all five component scores. Lower is better.
37 rows
Primary metric: score
Sampled: 2026-05-27
Metadata
Metrics
Risk Index (lower is better), VCT Refusal risk (lower is better), HLE miscalibration (lower is better), MASK risk (lower is better), Machiavelli risk (lower is better), TextQuests Harm (lower is better)
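As a rough illustration of how such a composite could be computed, the sketch below assumes the index is an unweighted arithmetic mean of the five component scores, all on the same scale, and that a model missing any component gets no index. The key names in `COMPONENTS`, the `risk_index` function, and the rounding to one decimal are hypothetical choices for this example; the dashboard's actual weighting and normalization are not stated here.

```python
from statistics import fmean
from typing import Mapping, Optional

# Hypothetical keys for the five components named in the description.
COMPONENTS = (
    "vct_refusal_risk",
    "hle_miscalibration",
    "mask_risk",
    "machiavelli_risk",
    "textquests_harm",
)


def risk_index(scores: Mapping[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean of the five components (assumed), or None if any is missing."""
    values = [scores.get(name) for name in COMPONENTS]
    if any(v is None for v in values):
        # Models without all component scores are left out of the index.
        return None
    return round(fmean(values), 1)


# Made-up component scores for illustration only, not taken from the dashboard.
example = {
    "vct_refusal_risk": 10.0,
    "hle_miscalibration": 20.0,
    "mask_risk": 30.0,
    "machiavelli_risk": 40.0,
    "textquests_harm": 50.0,
}
print(risk_index(example))
```

Under these assumptions, ranking amounts to sorting the models that have an index in ascending order, since lower is better.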
| Rank | Subject | Risk Index | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Opus 4.7 | 32.9 | Claude Opus 4.7 anthropic-claude-opus-4.7 | Imported | 2026-05-27 |
| 2 | Sonnet 4.5 | 34.1 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-27 |
| 3 | Opus 4.5 | 34.7 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-27 |
| 4 | Grok 4.3 | 38.5 | Grok 4.3 x-ai-grok-4.3 | Imported | 2026-05-27 |
| 5 | Sonnet 4.6 | 38.8 | Claude Sonnet 4.6 anthropic-claude-sonnet-4.6 | Imported | 2026-05-27 |
| 6 | Grok 4.2 | 38.8 | Grok 4.20 x-ai-grok-4.20 | Imported | 2026-05-27 |
| 7 | Opus 4.6 | 40.7 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-27 |
| 8 | GPT-5.5 | 42.4 | GPT-5.5 openai-gpt-5.5 | Imported | 2026-05-27 |
| 9 | GPT-5.2 | 42.6 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-27 |
| 10 | GPT-5.4 | 44.5 | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-27 |
| 11 | GPT-5.4-mini | 44.9 | GPT-5.4 Mini openai-gpt-5.4-mini | Imported | 2026-05-27 |
| 12 | GPT-5.1 | 46.4 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-27 |
| 13 | GPT-5 | 46.9 | GPT-5 openai-gpt-5 | Imported | 2026-05-27 |
| 14 | Haiku 4.5 | 47.0 | Claude Haiku 4.5 anthropic-claude-haiku-4.5 | Imported | 2026-05-27 |
| 15 | Grok 4 | 47.2 | Grok 4 x-ai-grok-4 | Imported | 2026-05-27 |
| 16 | GPT-5.4-Nano | 48.7 | GPT-5.4 Nano openai-gpt-5.4-nano | Imported | 2026-05-27 |
| 17 | GLM 5.1 | 50.3 | GLM 5.1 z-ai-glm-5.1 | Imported | 2026-05-27 |
| 18 | GPT-5-mini | 51.1 | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-27 |
| 19 | GPT-5-Nano | 51.9 | GPT-5 Nano openai-gpt-5-nano | Imported | 2026-05-27 |
| 20 | Grok 4.1 Fast | 52.2 | Grok 4.1 Fast x-ai-grok-4.1-fast | Imported | 2026-05-27 |
| 21 | DeepSeek 4 Pro | 54.1 | DeepSeek V4 Pro deepseek-deepseek-v4-pro | Imported | 2026-05-27 |
| 22 | Grok 4 Fast | 55.3 | Grok 4 Fast x-ai-grok-4-fast | Imported | 2026-05-27 |
| 23 | Gemini 3.1 Pro | 55.6 | Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview | Imported | 2026-05-27 |
| 24 | DeepSeek 3.2 | 56.9 | DeepSeek V3.2 deepseek-deepseek-v3.2 | Imported | 2026-05-27 |
| 25 | Kimi K2 | 57.4 | MoonshotAI: Kimi K2 Thinking moonshotai-kimi-k2-thinking | Imported | 2026-05-27 |
| 26 | DeepSeek R1 | 57.4 | R1 deepseek-r1 | Imported | 2026-05-27 |
| 27 | Gemini 3.5 Flash | 58.2 | Gemini 3.5 Flash google-gemini-3.5-flash | Imported | 2026-05-27 |
| 28 | Gemini 2.5 Pro | 59.0 | Gemini 2.5 Pro google-gemini-2.5-pro | Imported | 2026-05-27 |
| 29 | Gemini 3 Flash | 59.4 | Gemini 3 Flash Preview google-gemini-3-flash-preview | Imported | 2026-05-27 |
| 30 | Gemini 3 Pro | 59.9 | Gemini 3 google-gemini-3 | Imported | 2026-05-27 |
| 31 | Gemini 2.5 Flash | 60.1 | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-27 |
| 32 | o3-mini | 60.1 | o3 Mini High openai-o3-mini-high | Imported | 2026-05-27 |
| 33 | Kimi K2.5 | 61.5 | MoonshotAI: Kimi K2.5 moonshotai-kimi-k2.5 | Imported | 2026-05-27 |
| 34 | Gemini 3.1 Flash-Lite | 61.7 | Gemini 3.1 Flash Lite Preview google-gemini-3.1-flash-lite-preview | Imported | 2026-05-27 |
| 35 | Kimi K2.6 | 63.0 | MoonshotAI: Kimi K2.6 moonshotai-kimi-k2.6 | Imported | 2026-05-27 |
| 36 | Gemini 2.5 Flash-Lite | 66.4 | Gemini 2.5 Flash Lite google-gemini-2.5-flash-lite | Imported | 2026-05-27 |
| 37 | GPT-4o | 67.0 | GPT-4o (2024-11-20) openai-gpt-4o-2024-11-20 | Imported | 2026-05-27 |