CAIS Risk Index

A composite risk index from the CAIS AI Dashboard. It averages five components: VCT refusal risk, HLE miscalibration, MASK risk, Machiavelli risk, and TextQuests Harm. Only models with a score for every component are included. Lower is better.

Rows: 37
Primary metric: score
Sampled: 2026-05-27

Metadata

Metrics

Risk Index (lower is better), VCT Refusal risk (lower is better), HLE miscalibration (lower is better), MASK risk (lower is better), Machiavelli risk (lower is better), TextQuests Harm (lower is better)

Latest Results

Imported from the public CAIS AI Dashboard bundle. Composite scores mirror the dashboard client: models missing any selected component are excluded; VCT Refusal and MASK are inverted for the Risk Index.
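
The composite rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the dashboard's actual client code: it assumes every component is on a 0 to 100 scale, that "inverted" means `100 - score`, and that the index is an unweighted mean. The component keys and the `risk_index` helper are hypothetical names, and the sample scores are made up.

```python
from statistics import mean
from typing import Mapping, Optional

# Hypothetical component keys; the real bundle's field names may differ.
COMPONENTS = (
    "vct_refusal",
    "hle_miscalibration",
    "mask",
    "machiavelli",
    "textquests_harm",
)

# Components where a higher raw score is better, so they are flipped
# into a risk before averaging (assumed here to be 100 - score).
INVERTED = {"vct_refusal", "mask"}


def risk_index(scores: Mapping[str, Optional[float]]) -> Optional[float]:
    """Return the unweighted mean risk, or None if any component is missing."""
    risks = []
    for name in COMPONENTS:
        raw = scores.get(name)
        if raw is None:
            # Models missing any component are excluded from the composite.
            return None
        risks.append(100.0 - raw if name in INVERTED else raw)
    return mean(risks)


# Made-up scores for illustration only.
complete = {
    "vct_refusal": 80.0,
    "hle_miscalibration": 60.0,
    "mask": 90.0,
    "machiavelli": 20.0,
    "textquests_harm": 10.0,
}
incomplete = {"vct_refusal": 80.0, "mask": 90.0}

print(risk_index(complete))    # 24.0
print(risk_index(incomplete))  # None
```

Returning `None` for an incomplete model, rather than averaging the components that are present, keeps scores comparable across rows, which matches the exclusion rule stated above.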

| Rank | Subject | Risk Index | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Opus 4.7 | 32.9 | Claude Opus 4.7 (anthropic-claude-opus-4.7) | Imported | 2026-05-27 |
| 2 | Sonnet 4.5 | 34.1 | Claude Sonnet 4.5 (anthropic-claude-sonnet-4.5) | Imported | 2026-05-27 |
| 3 | Opus 4.5 | 34.7 | Claude Opus 4.5 (anthropic-claude-opus-4.5) | Imported | 2026-05-27 |
| 4 | Grok 4.3 | 38.5 | Grok 4.3 (x-ai-grok-4.3) | Imported | 2026-05-27 |
| 5 | Sonnet 4.6 | 38.8 | Claude Sonnet 4.6 (anthropic-claude-sonnet-4.6) | Imported | 2026-05-27 |
| 6 | Grok 4.2 | 38.8 | Grok 4.20 (x-ai-grok-4.20) | Imported | 2026-05-27 |
| 7 | Opus 4.6 | 40.7 | Claude Opus 4.6 (anthropic-claude-opus-4.6) | Imported | 2026-05-27 |
| 8 | GPT-5.5 | 42.4 | GPT-5.5 (openai-gpt-5.5) | Imported | 2026-05-27 |
| 9 | GPT-5.2 | 42.6 | GPT-5.2 (openai-gpt-5.2) | Imported | 2026-05-27 |
| 10 | GPT-5.4 | 44.5 | GPT-5.4 (openai-gpt-5.4) | Imported | 2026-05-27 |
| 11 | GPT-5.4-mini | 44.9 | GPT-5.4 Mini (openai-gpt-5.4-mini) | Imported | 2026-05-27 |
| 12 | GPT-5.1 | 46.4 | GPT-5.1 (openai-gpt-5.1) | Imported | 2026-05-27 |
| 13 | GPT-5 | 46.9 | GPT-5 (openai-gpt-5) | Imported | 2026-05-27 |
| 14 | Haiku 4.5 | 47.0 | Claude Haiku 4.5 (anthropic-claude-haiku-4.5) | Imported | 2026-05-27 |
| 15 | Grok 4 | 47.2 | Grok 4 (x-ai-grok-4) | Imported | 2026-05-27 |
| 16 | GPT-5.4-Nano | 48.7 | GPT-5.4 Nano (openai-gpt-5.4-nano) | Imported | 2026-05-27 |
| 17 | GLM 5.1 | 50.3 | GLM 5.1 (z-ai-glm-5.1) | Imported | 2026-05-27 |
| 18 | GPT-5-mini | 51.1 | GPT-5 Mini (openai-gpt-5-mini) | Imported | 2026-05-27 |
| 19 | GPT-5-Nano | 51.9 | GPT-5 Nano (openai-gpt-5-nano) | Imported | 2026-05-27 |
| 20 | Grok 4.1 Fast | 52.2 | Grok 4.1 Fast (x-ai-grok-4.1-fast) | Imported | 2026-05-27 |
| 21 | DeepSeek 4 Pro | 54.1 | DeepSeek V4 Pro (deepseek-deepseek-v4-pro) | Imported | 2026-05-27 |
| 22 | Grok 4 Fast | 55.3 | Grok 4 Fast (x-ai-grok-4-fast) | Imported | 2026-05-27 |
| 23 | Gemini 3.1 Pro | 55.6 | Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview) | Imported | 2026-05-27 |
| 24 | DeepSeek 3.2 | 56.9 | DeepSeek V3.2 (deepseek-deepseek-v3.2) | Imported | 2026-05-27 |
| 25 | Kimi K2 | 57.4 | MoonshotAI: Kimi K2 Thinking (moonshotai-kimi-k2-thinking) | Imported | 2026-05-27 |
| 26 | DeepSeek R1 | 57.4 | R1 (deepseek-r1) | Imported | 2026-05-27 |
| 27 | Gemini 3.5 Flash | 58.2 | Gemini 3.5 Flash (google-gemini-3.5-flash) | Imported | 2026-05-27 |
| 28 | Gemini 2.5 Pro | 59.0 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-27 |
| 29 | Gemini 3 Flash | 59.4 | Gemini 3 Flash Preview (google-gemini-3-flash-preview) | Imported | 2026-05-27 |
| 30 | Gemini 3 Pro | 59.9 | Gemini 3 (google-gemini-3) | Imported | 2026-05-27 |
| 31 | Gemini 2.5 Flash | 60.1 | Gemini 2.5 Flash (google-gemini-2.5-flash) | Imported | 2026-05-27 |
| 32 | o3-mini | 60.1 | o3 Mini High (openai-o3-mini-high) | Imported | 2026-05-27 |
| 33 | Kimi K2.5 | 61.5 | MoonshotAI: Kimi K2.5 (moonshotai-kimi-k2.5) | Imported | 2026-05-27 |
| 34 | Gemini 3.1 Flash-Lite | 61.7 | Gemini 3.1 Flash Lite Preview (google-gemini-3.1-flash-lite-preview) | Imported | 2026-05-27 |
| 35 | Kimi K2.6 | 63.0 | MoonshotAI: Kimi K2.6 (moonshotai-kimi-k2.6) | Imported | 2026-05-27 |
| 36 | Gemini 2.5 Flash-Lite | 66.4 | Gemini 2.5 Flash Lite (google-gemini-2.5-flash-lite) | Imported | 2026-05-27 |
| 37 | GPT-4o | 67.0 | GPT-4o (2024-11-20) (openai-gpt-4o-2024-11-20) | Imported | 2026-05-27 |